mbeckerle commented on a change in pull request #651:
URL: https://github.com/apache/daffodil/pull/651#discussion_r722286469
##########
File path:
daffodil-core/src/main/scala/org/apache/daffodil/dsom/RawCommonRuntimeValuedPropertiesMixin.scala
##########
@@ -64,8 +64,6 @@ trait RawSequenceRuntimeValuedPropertiesMixin
trait RawLayeringRuntimeValuedPropertiesMixin
extends PropertyMixin {
- protected final lazy val optionLayerTransformRaw =
findPropertyOption("layerTransform", expressionAllowed = true)
- protected final lazy val layerTransformRaw =
requireProperty(optionLayerTransformRaw)
protected final lazy val optionLayerEncodingRaw =
findPropertyOption("layerEncoding", expressionAllowed = true)
Review comment:
The general principle for DFDL was all properties that are needed are
required to be expressed. However, now that DFDL v1.0 is finished, new
properties added would need a control over when to force that behavior, and
when to tolerate them being missing.
The dfdlx:layerEncoding property is only required if the layer transform
needs an encoding, and it furthermore only needs it if the xs:sequence, on
which the layer is expressed either doesn't have the property at all (perhaps
because it doesn't need it), or the layer transform requires a different
encoding property than the xs:sequence.
I expect a primary use case to be a textual data document in UTF-8,
containing say, a base64-encoded gzipped region, which we know can only contain
ascii characters. The dfdlx:layerEncoding can specify ascii, which more
accurately reflects the format constraint on that region. That's the notion
anyway.
##########
File path:
daffodil-core/src/main/scala/org/apache/daffodil/layers/LayerCompilerInfoImpl.scala
##########
@@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.daffodil.layers
+
+import org.apache.daffodil.api.WarnID
+import org.apache.daffodil.dsom.SequenceTermBase
+import org.apache.daffodil.processors.LayerBoundaryMarkEv
+import org.apache.daffodil.processors.LayerCharsetEv
+import org.apache.daffodil.processors.LayerLengthEv
+import org.apache.daffodil.processors.charset.BitsCharsetJava
+import org.apache.daffodil.processors.charset.BitsCharsetNonByteSize
+import org.apache.daffodil.schema.annotation.props.gen.LayerLengthKind
+import org.apache.daffodil.schema.annotation.props.gen.LayerLengthUnits
+import org.apache.daffodil.util.Maybe
+
+
+/**
+ * Provides access to DFDL schema compile-time information about the layer
properties.
+ *
+ * Allows reporting of schema definition errors and warnings at schema compile
time.
+ */
+final class LayerCompileInfoImpl(sequence: SequenceTermBase,
+ maybeLayerCharsetEv: Maybe[LayerCharsetEv],
+ maybeLayerLengthKind: Maybe[LayerLengthKind],
+ maybeLayerLengthEv: Maybe[LayerLengthEv],
+ maybeLayerLengthUnits: Maybe[LayerLengthUnits],
+ maybeLayerBoundaryMarkEv: Maybe[LayerBoundaryMarkEv])
+extends LayerCompileInfo {
+ /**
+ * If defined, and the value is a compile-time constant then this will be
Some(Some(Charset))
+ * If defined, and the value is non-constant or not a regular JVM charset,
then this will be Some(None)
+ * If undefined, the value is None
+ */
+ override def optLayerJavaCharsetOptConstantValue:
Option[Option[java.nio.charset.Charset]] = {
+ if (maybeLayerCharsetEv.isEmpty) None
+ else
+ maybeLayerCharsetEv.get.optConstant.map {
+ case java: BitsCharsetJava => Some(java.javaCharset)
+ case _: BitsCharsetNonByteSize => None
Review comment:
Yes, we need a test for this. This constraint may need to be lifted
someday, having a test that fails in that case will help.
##########
File path:
daffodil-propgen/src/main/resources/org/apache/daffodil/xsd/DFDL_part1_simpletypes.xsd
##########
@@ -697,13 +697,13 @@
</xsd:union>
</xsd:simpleType>
- <xsd:simpleType name="LayerTransformType_Or_DFDLExpression">
+ <xsd:simpleType name="LayerTransformType">
Review comment:
That can probably be done now yes, since the non dfdlx prefix versions
of these properties are no more.
##########
File path:
daffodil-runtime1/src/main/scala/org/apache/daffodil/layers/LayerCompileInfo.scala
##########
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.daffodil.layers
+
+import org.apache.daffodil.schema.annotation.props.gen.LayerLengthKind
+import org.apache.daffodil.schema.annotation.props.gen.LayerLengthUnits
+
+
+/**
+ * provides access to DFDL schema compile-time information about the layer
properties
+ */
+trait LayerCompileInfo {
Review comment:
I've so far not really worried about allowing this class of plug-in
extensions "in java" be a goal. I just want them to be pluggable in Scala.
When something is as intimately tied to Daffodil as these layer transformers
are, I don't think Java is particularly important. Certainly not as much as for
UDFs. Are our validators directly pluggable in Java? Or do you have to write
some glue code in Scala? I don't actually even know.
It doesn't bother me to make people learn a little scala to extend Daffodil.
Nudges them along toward becoming contributors :-)
##########
File path:
daffodil-runtime1/src/main/scala/org/apache/daffodil/layers/LayerTransformer.scala
##########
@@ -282,4 +234,234 @@ case class LayerNotEnoughDataException(layerSRD:
SequenceRuntimeData, state: PSt
extends LayerException(layerSRD, state, One(cause), None, nBytesRequired) {
override def isError = true
override def modeName = "Parse"
- }
\ No newline at end of file
+ }
+
+
+
+class LayerRuntimeInfoImpl(val srd: SequenceRuntimeData,
+ maybeLayerCharsetEv: Maybe[LayerCharsetEv],
+ maybeLayerLengthKind: Maybe[LayerLengthKind],
+ maybeLayerLengthEv: Maybe[LayerLengthEv],
+ maybeLayerLengthUnits: Maybe[LayerLengthUnits],
+ maybeLayerBoundaryMarkEv: Maybe[LayerBoundaryMarkEv])
+ extends LayerRuntimeInfo {
+
+ def evaluatables: Seq[Evaluatable[AnyRef]] =
+ maybeLayerCharsetEv.toScalaOption.toSeq ++
+ maybeLayerLengthEv.toScalaOption.toSeq ++
+ maybeLayerBoundaryMarkEv.toScalaOption.toSeq
+
+
+ def layerInfo(state: ParseOrUnparseState): LayerInfo =
+ new LayerInfoImpl(state, srd,
+ maybeLayerCharsetEv,
+ maybeLayerLengthKind,
+ maybeLayerLengthEv,
+ maybeLayerLengthUnits,
+ maybeLayerBoundaryMarkEv)
+}
+
+/**
+ * Allows access to all the layer properties, if defined, including
+ * evaluating expressions if the properties values are defined as expressions.
+ */
+sealed trait LayerInfo {
+ def optLayerCharset: Option[java.nio.charset.Charset]
+ def optLayerLengthKind: Option[LayerLengthKind]
+ def optLayerLength: Option[Long]
+ def optLayerLengthUnits: Option[LayerLengthUnits]
+ def optLayerBoundaryMark: Option[String]
+}
+
+class LayerInfoImpl(state: ParseOrUnparseState,
+ srd: SequenceRuntimeData,
+ maybeLayerCharsetEv: Maybe[LayerCharsetEv],
+ maybeLayerLengthKind: Maybe[LayerLengthKind],
+ maybeLayerLengthEv: Maybe[LayerLengthEv],
+ maybeLayerLengthUnits: Maybe[LayerLengthUnits],
+ maybeLayerBoundaryMarkEv: Maybe[LayerBoundaryMarkEv])
+extends LayerInfo {
+
+ override def optLayerCharset: Option[Charset] =
maybeLayerCharsetEv.toScalaOption.map {
+ _.evaluate(state)
+ } match {
+ case Some(bitsCharsetJava: BitsCharsetJava) =>
Some(bitsCharsetJava.javaCharset)
+ case None => None
+ case Some(other) => None // only java charsets are supported.
+ }
+
+ override def optLayerLengthKind: Option[LayerLengthKind] =
maybeLayerLengthKind.toScalaOption
+
+ override def optLayerLength: Option[Long] =
maybeLayerLengthEv.toScalaOption.map{ _.evaluate(state) }
+
+ override def optLayerLengthUnits: Option[LayerLengthUnits] =
maybeLayerLengthUnits.toScalaOption
+
+ override def optLayerBoundaryMark: Option[String] =
maybeLayerBoundaryMarkEv.toScalaOption.map{ _.evaluate(state) }
+}
+
+
+abstract class ByteBufferExplicitLengthLayerTransform[T](
+ layerRuntimeInfoArg: LayerRuntimeInfo,
+ layerName: String,
+ variablesNamespace: String,
+ variablesPreferredNamespacePrefix: String,
+ localNamesAndTypesOfVariablesToRead: Seq[(String, String)],
+ localNameOfVariableToWrite: String,
+ outputVariableTypeLocalNameAsString: String)
+ extends LayerTransformer(layerName, layerRuntimeInfoArg) {
+
+ /**
+ * Override to specify the length exactly.
+ */
+ protected def explicitLength() = -1L
Review comment:
This is only needed for a layer transform with a built-in constant
length, so that the schema author need not specify dfdlx:layerLengthKind
'explicit' and dfdl:layerLength='N' for the right length N.
If a layer transform does use those things, then they don't need to override
this method, and the -1 is ignored.
But I agree we should just make this pure virtual.
I think I will also rename it to layerBuiltInConstantLengthValue()
##########
File path:
daffodil-test/src/test/scala/org/apache/daffodil/layers/CheckDigit.scala
##########
@@ -17,63 +17,23 @@
package org.apache.daffodil.layers
-import org.apache.commons.io.IOUtils
-import org.apache.daffodil.schema.annotation.props.gen.LayerLengthKind
-import org.apache.daffodil.schema.annotation.props.gen.LayerLengthUnits
-import org.apache.daffodil.util.Maybe
-import org.apache.daffodil.processors.LayerLengthEv
-import org.apache.daffodil.processors.LayerBoundaryMarkEv
-import org.apache.daffodil.processors.LayerCharsetEv
-import org.apache.daffodil.processors.parsers.PState
-import org.apache.daffodil.processors.unparsers.UState
-
-import java.io._
-import org.apache.daffodil.dpath.NodeInfo.PrimType
-import org.apache.daffodil.exceptions.Assert
-import org.apache.daffodil.processors.CharsetEvBase
-import org.apache.daffodil.processors.SequenceRuntimeData
-import org.apache.daffodil.processors.SuspendableOperation
-import org.apache.daffodil.processors.charset.BitsCharsetJava
-import org.apache.daffodil.util.ByteBufferOutputStream
import org.apache.daffodil.util.Logger
-import org.apache.daffodil.xml.NS
-import org.apache.daffodil.xml.RefQName
import java.nio.ByteBuffer
-import java.nio.charset.Charset
-
-final class CheckDigitExplicit(
- srd: SequenceRuntimeData,
- charset: Charset,
- layerLengthEv: LayerLengthEv)
-extends LayerTransformer() {
- Assert.invariant(charset.newDecoder().maxCharsPerByte() == 1) // is a SBCS
charset
-
- private val checkDigitVRD = {
- val varNamespace = NS("urn:org.apache.daffodil.layers.checkDigit")
- val varQName = RefQName(Some("cd"), "checkDigit",
varNamespace).toGlobalQName
- val vrd = srd.variableMap.getVariableRuntimeData(varQName).getOrElse {
- srd.SDE("Variable '%s' is not defined.", varQName.toExtendedSyntax)
- }
- srd.schemaDefinitionUnless(vrd.primType == PrimType.Int,
- "Variable '%s' is not of type 'xs:int'.", varQName.toExtendedSyntax)
- vrd
- }
- private val checkDigitParamsVRD = {
- val varNamespace = NS("urn:org.apache.daffodil.layers.checkDigit")
- val varQName = RefQName(Some("cd"), "checkDigitParams",
varNamespace).toGlobalQName
- val vrd = srd.variableMap.getVariableRuntimeData(varQName).getOrElse {
- srd.SDE("Variable '%s' is not defined.", varQName.toExtendedSyntax)
- }
- srd.schemaDefinitionUnless(vrd.primType == PrimType.String,
- "Variable '%s' is not of type 'xs:string'.", varQName.toExtendedSyntax)
- vrd
- }
+final class CheckDigitExplicit(name: String,
+ infoArg: LayerRuntimeInfo)
+extends ByteBufferExplicitLengthLayerTransform[Int](
+ infoArg,
+ name,
+ variablesNamespace = "urn:org.apache.daffodil.layers.checkDigit",
+ variablesPreferredNamespacePrefix = "cd",
+ localNamesAndTypesOfVariablesToRead = Seq(("checkDigitParams", "string")),
Review comment:
I will investigate that, vs. this notion of using the string name of a
primitive type.
##########
File path:
daffodil-runtime1/src/main/scala/org/apache/daffodil/layers/LayerTransformer.scala
##########
@@ -123,19 +67,27 @@ object LayerTransformerFactory {
* A layer transformer is created at runtime as part of a single parse/unparse
call.
* Hence, they can be stateful without causing thread-safety issues.
*/
-abstract class LayerTransformer() {
+abstract class LayerTransformer(layerName: String, layerRuntimeInfoArg:
LayerRuntimeInfo) {
+
+ final def layerRuntimeInfo =
layerRuntimeInfoArg.asInstanceOf[LayerRuntimeInfoImpl]
Review comment:
This is the tricky design point I think. Or "a" tricky design point. The
LayerCompiler is definitely user-created. The LayerTransformer is user-created.
The LayerTrasnformerFactory is pretty much boilerplate everywhere I think.
This 3-level cake of LayerCompiler with LayerCompileInfo,
LayerTransformerFactory with LayerRuntimeInfo, and LayerTransformer, some of
the methods of which are called with LayerInfo, well it works, and arranges for
the serialization to work, and not to expose things in the runtime we don't
want to expose, but that middle step does bother me a bit.
##########
File path:
daffodil-runtime1-unparser/src/main/scala/org/apache/daffodil/processors/unparsers/LayeredSequenceUnparser.scala
##########
@@ -17,20 +17,22 @@
package org.apache.daffodil.processors.unparsers
-import org.apache.daffodil.processors.LayerTransformerEv
+import org.apache.daffodil.layers.LayerRuntimeInfoImpl
+import org.apache.daffodil.layers.LayerTransformerFactory
import org.apache.daffodil.processors.SequenceRuntimeData
class LayeredSequenceUnparser(ctxt: SequenceRuntimeData,
- layerTransformerEv: LayerTransformerEv,
+ layerTransformerFactory: LayerTransformerFactory,
childUnparser: SequenceChildUnparser)
extends OrderedUnseparatedSequenceUnparser(ctxt, Seq(childUnparser)) {
- override lazy val runtimeDependencies = Vector(layerTransformerEv)
+ override lazy val runtimeDependencies =
+
layerTransformerFactory.layerRuntimeInfo.asInstanceOf[LayerRuntimeInfoImpl].evaluatables.toVector
Review comment:
Yup. You've definitely focused in on the architectural issue with this
design point.
On next revision I'll see if I can fix this.
##########
File path:
daffodil-runtime1/src/main/scala/org/apache/daffodil/layers/LayerCompileInfo.scala
##########
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.daffodil.layers
+
+import org.apache.daffodil.schema.annotation.props.gen.LayerLengthKind
+import org.apache.daffodil.schema.annotation.props.gen.LayerLengthUnits
+
+
+/**
+ * provides access to DFDL schema compile-time information about the layer
properties
+ */
+trait LayerCompileInfo {
+ /**
+ * If defined, and the value is a compile-time constant then this will be
Some(Some(Charset))
+ * If defined, and the value is non-constant then this will be Some(None)
+ * If undefined, the value is None
+ */
+ def optLayerJavaCharsetOptConstantValue:
Option[Option[java.nio.charset.Charset]]
+ def optLayerLengthKind: Option[LayerLengthKind]
+ def optLayerLengthOptConstantValue: Option[Option[Long]]
+ def optLayerLengthUnits: Option[LayerLengthUnits]
+ def optLayerBoundaryMarkOptConstantValue: Option[Option[String]]
+ def schemaDefinitionError(message: String, args: Any*): Nothing
+ def schemaDefinitionWarning(message: String, args: Any*): Unit
+ final def schemaDefinitionUnless(test: Boolean, message: String, args: Any*)
= if (!test) SDE(message, args: _*)
Review comment:
I agree with just removing it.
##########
File path:
daffodil-core/src/main/scala/org/apache/daffodil/layers/LayerCompilerRegistry.scala
##########
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.daffodil.layers
+
+
+import org.apache.daffodil.exceptions.ThrowsSDE
+
+import scala.collection.mutable
+
+/**
+ * Transformers have factories. This lets you find the transformer factory
+ * by the name obtained from dfdlx:layerTransform.
+ */
+object LayerCompilerRegistry {
+
+ private lazy val layerCompilerMap = new mutable.HashMap[String,
LayerCompiler]
+
+ def register(layerCompiler: LayerCompiler): Unit = {
+ layerCompilerMap.put(layerCompiler.name, layerCompiler)
+ }
+ /**
+ * Given name, finds the factory for the transformer. SDE otherwise.
+ *
+ * The state is passed in order to provide diagnostic context if not found.
+ */
+ def find(name: String, context: ThrowsSDE): LayerCompiler = {
+ val maybeCompiler: Option[LayerCompiler] = layerCompilerMap.get(name)
+ if (maybeCompiler.isEmpty) {
+ //
+ // TODO: Daffodil-1927
+ // This is where we would try to dynamically load a layer transform
compiler
+ // for the given layer name.
+ //
+ val choices = layerCompilerMap.keySet.mkString(", ")
+ context.SDE("The dfdlx:layerTransform '%s' was not found. Available
choices are: %s", name, choices)
+ } else {
+ maybeCompiler.get
+ }
+ }
Review comment:
That's the idea yes. I wanted to first address the issue of Daffodil
internals leaking out of a porous API.
##########
File path:
daffodil-runtime1/src/main/scala/org/apache/daffodil/layers/LayerTransformer.scala
##########
@@ -282,4 +234,234 @@ case class LayerNotEnoughDataException(layerSRD:
SequenceRuntimeData, state: PSt
extends LayerException(layerSRD, state, One(cause), None, nBytesRequired) {
override def isError = true
override def modeName = "Parse"
- }
\ No newline at end of file
+ }
+
+
+
+class LayerRuntimeInfoImpl(val srd: SequenceRuntimeData,
+ maybeLayerCharsetEv: Maybe[LayerCharsetEv],
+ maybeLayerLengthKind: Maybe[LayerLengthKind],
+ maybeLayerLengthEv: Maybe[LayerLengthEv],
+ maybeLayerLengthUnits: Maybe[LayerLengthUnits],
+ maybeLayerBoundaryMarkEv: Maybe[LayerBoundaryMarkEv])
+ extends LayerRuntimeInfo {
+
+ def evaluatables: Seq[Evaluatable[AnyRef]] =
+ maybeLayerCharsetEv.toScalaOption.toSeq ++
+ maybeLayerLengthEv.toScalaOption.toSeq ++
+ maybeLayerBoundaryMarkEv.toScalaOption.toSeq
+
+
+ def layerInfo(state: ParseOrUnparseState): LayerInfo =
+ new LayerInfoImpl(state, srd,
+ maybeLayerCharsetEv,
+ maybeLayerLengthKind,
+ maybeLayerLengthEv,
+ maybeLayerLengthUnits,
+ maybeLayerBoundaryMarkEv)
+}
+
+/**
+ * Allows access to all the layer properties, if defined, including
+ * evaluating expressions if the properties values are defined as expressions.
+ */
+sealed trait LayerInfo {
Review comment:
So, LayerCompileInfo is everything needed to compile the layer - and
issue errors at compile time.
LayerRuntimeInfo isn't everything the layer needs. It is just a
serialization of the layer-oriented property values that were defined for that
layer. These are already known to be correct and adequate based on
LayerCompileInfo.
Anything else the layer needs, by way of parameters, has to be supplied via
DFDL variables at runtime.
The reason we need LayerIinfo, is that there are things that cannot be
computed without a PState/UState because some properties can be expressions.
LayerInfo contains the result of evaluating those evaluatables given the
PState/UState passed at runtime to the transformation methods. This includes
the layer length (from LayerLengthEv), the LayerBoundaryMarkEv, and I think the
layer encoding as well.
But you make a good point which is that we are already doing the separate
compile step now, so perhaps the LayerTransformerFactory becomes internal, and
the parser/unparser creates the LayerTransformer instance the same way every
time, and does so only once we have a PState/UState, at real parse/unparse time.
I will look into this, as the 3 level cake here does seem like 1 too many.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]