stevedlawrence commented on a change in pull request #560:
URL: https://github.com/apache/daffodil/pull/560#discussion_r637899469
##########
File path:
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala
##########
@@ -347,7 +346,11 @@ class DataProcessor private (
override def getDiagnostics = ssrd.diagnostics
- override def newXMLReaderInstance: DFDL.DaffodilParseXMLReader = new
DaffodilParseXMLReader(this)
+ override def newXMLReaderInstance: DFDL.DaffodilParseXMLReader = {
+ val xrdr = new DaffodilParseXMLReader(this)
+ XMLUtils.setSecureDefaults(xrdr)
Review comment:
Thoughts on just removing this call? We aren't getting a random
XMLReader here based on some factory that happens to be on the classpath. We
know it's a Daffodil specific XML reader, and we know it's secure from these
XML issues. Removing this would potentially mean we could remove the new logic
added in set/getFeature to accept features that we don't really support.
##########
File path:
daffodil-core/src/main/scala/org/apache/daffodil/runtime1/SchemaSetRuntime1Mixin.scala
##########
@@ -72,37 +74,61 @@ trait SchemaSetRuntime1Mixin { self : SchemaSet =>
}.value
def onPath(xpath: String): DFDL.DataProcessor = {
- Assert.usage(!isError)
- if (xpath != "/") root.notYetImplemented("""Path must be "/". Other path
support is not yet implemented.""")
- val rootERD = root.elementRuntimeData
- root.schemaDefinitionUnless(
- rootERD.outputValueCalcExpr.isEmpty,
- "The root element cannot have the dfdl:outputValueCalc property.")
- val validationMode = ValidationMode.Off
- val p = if (!root.isError) parser else null
- val u = if (!root.isError) unparser else null
- val ssrd = new SchemaSetRuntimeData(
- p,
- u,
- this.diagnostics,
- rootERD,
- variableMap,
- typeCalcMap)
- if (root.numComponents > root.numUniqueComponents)
- log(LogLevel.Info, "Compiler: component counts: unique %s, actual %s.",
- root.numUniqueComponents, root.numComponents)
- val dataProc = new DataProcessor(ssrd, tunable,
self.compilerExternalVarSettings)
- if (dataProc.isError) {
- // NO longer printing anything here. Callers must do this.
- // val diags = dataProc.getDiagnostics
- // log(LogLevel.Error,"Compilation (DataProcessor) reports %s
compile errors/warnings.", diags.length)
- // diags.foreach { diag => log(LogLevel.Error, diag.toString())
}
- } else {
- log(LogLevel.Compile, "Parser = %s.", ssrd.parser.toString)
- log(LogLevel.Compile, "Unparser = %s.", ssrd.unparser.toString)
- log(LogLevel.Compile, "Compilation (DataProcesor) completed with no
errors.")
- }
- dataProc
+ Assert.usage(!isError)
+ if (xpath != "/") root.notYetImplemented("""Path must be "/". Other path
support is not yet implemented.""")
+ val rootERD = root.elementRuntimeData
+ root.schemaDefinitionUnless(
+ rootERD.outputValueCalcExpr.isEmpty,
+ "The root element cannot have the dfdl:outputValueCalc property.")
+ val validationMode = ValidationMode.Off
+ val p = if (!root.isError) parser else null
+ val u = if (!root.isError) unparser else null
+ val ssrd = new SchemaSetRuntimeData(
+ p,
+ u,
+ this.diagnostics,
+ rootERD,
+ variableMap,
+ typeCalcMap)
+ if (root.numComponents > root.numUniqueComponents)
+ log(LogLevel.Info, "Compiler: component counts: unique %s, actual %s.",
+ root.numUniqueComponents, root.numComponents)
+ val dataProc = new DataProcessor(ssrd, tunable,
self.compilerExternalVarSettings)
+ //
+ // now we fake serialize to a dev/null-type output stream which forces
+ // any lazy evaluation that hasn't completed to complete.
+ // Those things could signal errors, so we do this before we check for
errors.
+ //
+ // Note that calling preSerialization is not sufficient, since that's only
mixed into
+ // objects with lazy evaluation. A SSRD is just a tuple-like object, does
not mixin
+ // preSerialization, and shouldn't need to. We need to
+ // serialize all its substructure to insure all preSerializations, that
force
+ // all lazy evaluations, are done.
+ //
+ // Overhead-wise, this is costly, if the caller is about to save the
processor themselves
+ // But as there have been cases of Runtime1 processors which end up doing
lazy evaluation
+ // that ends up happening late, this eliminates a source of bugs, albeit,
by masking them
+ // so they are not detectable.
+ //
+ // Best to address this for real when we refactor Runtime1 to fully
separate it from
+ // the schema compiler. At that point we can draw a firmer line about the
compiler's output
+ // being fully realized before runtime objects are constructed.
+ //
+ // We don't call save() here, because that does a few other things than
just serialize.
+ val oos = new
ObjectOutputStream(org.apache.commons.io.output.NullOutputStream.NULL_OUTPUT_STREAM)
+ oos.writeObject(dataProc)
+
+ if (dataProc.isError) {
+ // NO longer printing anything here. Callers must do this.
+ // val diags = dataProc.getDiagnostics
+ // log(LogLevel.Error,"Compilation (DataProcessor) reports %s
compile errors/warnings.", diags.length)
+ // diags.foreach { diag => log(LogLevel.Error, diag.toString()) }
+ } else {
+ log(LogLevel.Compile, "Parser = %s.", ssrd.parser.toString)
+ log(LogLevel.Compile, "Unparser = %s.", ssrd.unparser.toString)
+ log(LogLevel.Compile, "Compilation (DataProcesor) completed with no
errors.")
+ }
+ dataProc
}
-}
+}
Review comment:
This file is missing a new line. Not sure why that would be removed. Did
you IDE settings change?
##########
File path:
daffodil-core/src/main/scala/org/apache/daffodil/runtime1/SchemaSetRuntime1Mixin.scala
##########
@@ -72,37 +74,61 @@ trait SchemaSetRuntime1Mixin { self : SchemaSet =>
}.value
def onPath(xpath: String): DFDL.DataProcessor = {
- Assert.usage(!isError)
- if (xpath != "/") root.notYetImplemented("""Path must be "/". Other path
support is not yet implemented.""")
- val rootERD = root.elementRuntimeData
- root.schemaDefinitionUnless(
- rootERD.outputValueCalcExpr.isEmpty,
- "The root element cannot have the dfdl:outputValueCalc property.")
- val validationMode = ValidationMode.Off
- val p = if (!root.isError) parser else null
- val u = if (!root.isError) unparser else null
- val ssrd = new SchemaSetRuntimeData(
- p,
- u,
- this.diagnostics,
- rootERD,
- variableMap,
- typeCalcMap)
- if (root.numComponents > root.numUniqueComponents)
- log(LogLevel.Info, "Compiler: component counts: unique %s, actual %s.",
- root.numUniqueComponents, root.numComponents)
- val dataProc = new DataProcessor(ssrd, tunable,
self.compilerExternalVarSettings)
- if (dataProc.isError) {
- // NO longer printing anything here. Callers must do this.
- // val diags = dataProc.getDiagnostics
- // log(LogLevel.Error,"Compilation (DataProcessor) reports %s
compile errors/warnings.", diags.length)
- // diags.foreach { diag => log(LogLevel.Error, diag.toString())
}
- } else {
- log(LogLevel.Compile, "Parser = %s.", ssrd.parser.toString)
- log(LogLevel.Compile, "Unparser = %s.", ssrd.unparser.toString)
- log(LogLevel.Compile, "Compilation (DataProcesor) completed with no
errors.")
- }
- dataProc
+ Assert.usage(!isError)
+ if (xpath != "/") root.notYetImplemented("""Path must be "/". Other path
support is not yet implemented.""")
+ val rootERD = root.elementRuntimeData
+ root.schemaDefinitionUnless(
+ rootERD.outputValueCalcExpr.isEmpty,
+ "The root element cannot have the dfdl:outputValueCalc property.")
+ val validationMode = ValidationMode.Off
+ val p = if (!root.isError) parser else null
+ val u = if (!root.isError) unparser else null
+ val ssrd = new SchemaSetRuntimeData(
+ p,
+ u,
+ this.diagnostics,
+ rootERD,
+ variableMap,
+ typeCalcMap)
+ if (root.numComponents > root.numUniqueComponents)
+ log(LogLevel.Info, "Compiler: component counts: unique %s, actual %s.",
+ root.numUniqueComponents, root.numComponents)
+ val dataProc = new DataProcessor(ssrd, tunable,
self.compilerExternalVarSettings)
+ //
+ // now we fake serialize to a dev/null-type output stream which forces
+ // any lazy evaluation that hasn't completed to complete.
+ // Those things could signal errors, so we do this before we check for
errors.
+ //
+ // Note that calling preSerialization is not sufficient, since that's only
mixed into
+ // objects with lazy evaluation. A SSRD is just a tuple-like object, does
not mixin
+ // preSerialization, and shouldn't need to. We need to
+ // serialize all its substructure to insure all preSerializations, that
force
+ // all lazy evaluations, are done.
+ //
+ // Overhead-wise, this is costly, if the caller is about to save the
processor themselves
+ // But as there have been cases of Runtime1 processors which end up doing
lazy evaluation
+ // that ends up happening late, this eliminates a source of bugs, albeit,
by masking them
+ // so they are not detectable.
Review comment:
Agreed with this comment. And even if a user doesn't call save(),
there's still going to be some overhead to serialize the objects, even if they
aren't written.
Can we create a bug so we remember to fix this and remove this
serialization? Feels like something we might want to have fixed before the next
release.
##########
File path:
daffodil-cli/src/it/scala/org/apache/daffodil/debugger/TestCLIDebugger.scala
##########
@@ -1265,20 +1265,29 @@ class TestCLIdebugger {
shell.sendLine("step")
shell.sendLine("step")
shell.sendLine("step")
- shell.sendLine("step")
- shell.sendLine("step")
+ //
+ // NOTE FOR REVIEWER
+ //
+ // This test had to be modified to work with this change set.
+ // I have no idea why this change set would affect behavior
+ // of this kind.
+ //
Review comment:
Maybe related to the serilazation stuff? Maybe forcing serialization is
cause different code paths to be triggers? Or maybe thing are getting evaluated
in different orders, which somehow changes thigns? Doesn't seem likely, but the
only thing I can think of. Does this behavior switch back if we remove your
serialization changes?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]