mbeckerle commented on code in PR #132:
URL: https://github.com/apache/daffodil-site/pull/132#discussion_r1612207243
##########
site/layers.md:
##########
@@ -0,0 +1,382 @@
+---
+layout: page
+title: Layers - Pluggable Extensions to Enable Algorithmic Transformations in
DFDL
+description: Pluggable Extensions to Enable Algorithmic Transformations in DFDL
+group: nav-right
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+## Introduction
+
+This page describes a DFDL language extension known as _Layers_.
+
+A _layer_ is an algorithmic transformation of the data stream that cannot be
expressed using
+regular DFDL properties.
+When parsing it is like a pre-processing of the data stream which
+happens before parsing.
+When unparsing it is like a post-processing of the data stream which
+happens after unparsing.
+The layer can underlie a part of the data stream, or all of it.
+
+### Built-in Layers
+
+Daffodil includes several built-in layers:
+- base64_MIME
+- fourbyteswap
+- twobyteswap
+- gzip
+- lineFolded_IMF
+- lineFolded_iCalendar
+
+Daffodil also includes two utility layers that are used in combination with
other layers to
+isolate the subset of the data stream the layer algorithm will operate upon.
+These are:
+- boundaryMark
+- fixedLength
+
+Each of the built-in layers will be documented separately below with examples
of their usage.
+
+### Custom Plug-In Layers
+
+Additional layers can be written in Java or Scala and deployed as _plug-ins_
for Daffodil.
+These are generally packaged as DFDL _layer schemas_, a kind of _component
schema_,
+that provide the layer packaged for import by other DFDL _assembly_ schemas
that use the
+layer in the data format they describe.
+
+## Transforming Layers and Checksum Layers
+
+There are two different kinds of layers, though they share many
characteristics. They are
+_transforming_ layers, and _checksum_ layers. Both run small algorithms over
part (or all) of
+the data stream. The difference is the purpose of the algorithm and its
output.
+
+#### Transforming Layers
+
+These layers decode data (when parsing), and encode data (when unparsing).
+The simplest example of a transforming layer is the `base64_MIME` layer which
+decodes the well known base64 encoding which is commonly used to encode binary
+data inside textual data formats.
+
+Besides decoding and encoding the data stream for the parser/unparser,
transforming layers can be
+parameterized using DFDL variables.
+They can also assign computed result values to DFDL variables, though this is
uncommon.
+
+Custom transforming layers are created by deriving an implementation from the
Daffodil API's
+[`Layer`](../docs/latest/javadoc/org/apache/daffodil/runtime1/layers/api/Layer.html)
class
+which is introduced in a later section.
+
+#### Checksum Layers
+
+Checksum layers are a simplified kind of layer that does not decode or encode data. They simply
+pass the data through unmodified, but while doing so they compute a checksum,
+hash, or Cyclic Redundancy Check (CRC) over the data stream.
+The value of the checksum (or hash or CRC) is assigned to a DFDL variable as
the result of the
+layer. This makes the value available for use by the DFDL schema that uses the
checksum layer.
+When parsing, the value of this DFDL variable can then be compared to a checksum field in the
+data, and either an invalid data element or a parse error can be created if the checksum in the
+data stream does not match the computed value.
+When unparsing, the value of this DFDL variable can be written to an element
using the
+`dfdl:outputValueCalc` property.
+
+An example of a layer plug-in is in the <a
href="https://github.com/DFDLSchemas/ethernetIP">EthernetIP</a>
+DFDL schema, which uses a Daffodil layer to describe the IPv4 packet header
checksum algorithm.
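
As a rough illustration of the kind of algorithm a checksum layer computes, here is a minimal Java sketch of the Internet checksum (RFC 1071) that IPv4 uses for its header. It is plain Java and does not use the Daffodil API or the EthernetIP schema's code; the class name `Ipv4ChecksumSketch` and the sample header bytes are assumptions chosen for illustration.

```java
// Illustrative sketch only: the RFC 1071 one's-complement checksum over an IPv4 header.
// Not taken from the EthernetIP schema and not written against the Daffodil layer API.
final class Ipv4ChecksumSketch {

    /** Computes the 16-bit one's-complement checksum of the given bytes. */
    static int checksum(byte[] header) {
        long sum = 0;
        for (int i = 0; i < header.length; i += 2) {
            int hi = header[i] & 0xFF;
            // An odd trailing byte is padded with zero, as RFC 1071 describes.
            int lo = (i + 1 < header.length) ? (header[i + 1] & 0xFF) : 0;
            sum += (hi << 8) | lo;
        }
        // Fold any carries above 16 bits back into the low 16 bits.
        while ((sum >>> 16) != 0) {
            sum = (sum & 0xFFFF) + (sum >>> 16);
        }
        return (int) (~sum & 0xFFFF);
    }

    public static void main(String[] args) {
        // Sample 20-byte header with the checksum field (bytes 10 and 11) set to zero.
        byte[] header = {
            0x45, 0x00, 0x00, 0x73, 0x00, 0x00, 0x40, 0x00,
            0x40, 0x11, 0x00, 0x00,
            (byte) 0xC0, (byte) 0xA8, 0x00, 0x01,
            (byte) 0xC0, (byte) 0xA8, 0x00, (byte) 0xC7
        };
        System.out.printf("checksum = 0x%04X%n", checksum(header)); // prints checksum = 0xB861
    }
}
```

When the checksum field already holds the correct value, running the same function over the whole header yields zero, which is the comparison a schema would make when parsing.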
+
+Custom checksum layers are created by deriving an implementation class from
the Daffodil API's
+[`ChecksumLayer`](../docs/latest/javadoc/org/apache/daffodil/runtime1/layers/api/ChecksumLayer.html)
+class, which is introduced in a later section.
+
+## Layer Composition
+
+Layers can be piled on top of each other. Once a layer is in place, another
layer can be used in
+conjunction with it.
+In the section on [Using Layers](#UsingLayers) below we will look at an
example that uses both the
+built-in _gzip_ layer, and the _base64_MIME_ layer together.
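
To make the idea of stacking layers concrete, the following Java sketch composes gzip and MIME base64 using only the JDK's stream classes. It is not how Daffodil implements these layers; the order of composition (gzip data carried as base64 text) and the class name `ComposedLayersSketch` are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustrative sketch only: two stream transformations stacked on one another.
final class ComposedLayersSketch {

    /** Unparse direction: gzip the payload, then base64 (MIME) encode the gzip bytes. */
    static byte[] encode(byte[] payload) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream base64 = Base64.getMimeEncoder().wrap(sink);
             OutputStream gzip = new GZIPOutputStream(base64)) {
            gzip.write(payload);
        } // closing gzip first finishes the gzip trailer, then the base64 padding is written
        return sink.toByteArray();
    }

    /** Parse direction: base64 decode the text, then gunzip the decoded bytes. */
    static byte[] decode(byte[] encoded) throws IOException {
        try (InputStream base64 = Base64.getMimeDecoder().wrap(new ByteArrayInputStream(encoded));
             InputStream gzip = new GZIPInputStream(base64)) {
            return gzip.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "Lorem ipsum dolor sit amet".getBytes(StandardCharsets.UTF_8);
        byte[] wire = encode(payload);
        System.out.println(new String(wire, StandardCharsets.US_ASCII));
        System.out.println(new String(decode(wire), StandardCharsets.UTF_8));
    }
}
```

The inner transformation never needs to know about the outer one, which is the property that lets layers be combined freely.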
+
+# Using Layers
+
+To use a layer you must know
+- the layer's namespace URI
+- the layer's name
+- the names of any layer parameter variables
+- the names of any layer result variables
+
+## Example: Line Folding
+
+As a first example, let's look at the line folding layer, specifically the `lineFolded_IMF` layer,
+which is built in to Daffodil.
+
+Line folding is a way of encoding textual data formats so that no line of text
is longer than
+a limited line length.
+
+Consider this data:
+```
+Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
+ tempor incididunt ut labore et dolore magna aliqua. Ut enim ad Lorem
+ ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
+ tempor incididunt ut labore et dolore magna aliqua. Ut enim ad
+```
+This data has been _line folded_ at roughly 72 characters by inserting a CRLF
+before an existing space in the data.
+Each line ends with a CRLF (\r\n) and the second through fourth lines begin
+with a space as a way of indicating that they are extension lines.
+This data is supposed to be reassembled to form a long single-line string by
removing
+all CRLF pairs.
+
+The result should be this single longer string which does not contain any line
endings:
+```
+Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad Lorem ipsum dolor sit
amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore
et dolore magna aliqua. Ut enim ad
+```
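
The unfolding rule described above (remove every CRLF pair and keep the space that followed it) can be sketched in a few lines of Java. This is plain string handling for illustration, not the `lineFolded_IMF` implementation, which may differ in detail; the class name `LineUnfoldSketch` is an assumption.

```java
// Illustrative sketch only: unfolding by deleting every CRLF pair.
final class LineUnfoldSketch {

    /** Removes all CRLF pairs; the leading space of each continuation line is kept. */
    static String unfold(String folded) {
        return folded.replace("\r\n", "");
    }

    public static void main(String[] args) {
        String folded = "adipiscing elit, sed do eiusmod\r\n tempor incididunt ut labore\r\n";
        System.out.println(unfold(folded));
    }
}
```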
+To achieve this we would use the `lineFolded_IMF` layer.
+This layer has its own namespace, for which our DFDL schema defines the prefix `lf` like this:
+```xml
+xmlns:lf="urn:org.apache.daffodil.layers.lineFolded"
+```
+Our DFDL schema will import the layer schema for the line-folded layer with
this import:
+```xml
+<import namespace="urn:org.apache.daffodil.layers.lineFolded"
+        schemaLocation="/org/apache/daffodil/layers/xsd/lineFoldedLayer.dfdl.xsd"/>
+```
+Then the layer is incorporated into our DFDL schema like this:
+```xml
+ <sequence dfdlx:layer="lf:lineFolded_IMF">
+ ... elements to be parsed from unfolded layer data go here ...
+ </sequence>
+```
+You can see that use of a layer is described using the `dfdlx:layer` property, and the specific layer
+is identified by a QName using the previously defined namespace prefix.
+The scope of the layer is the duration of the sequence it appears on.
+The `dfdlx:layer` property can only be used on an XSD `sequence`.
+
+The `lineFolded_IMF` layer does not define any DFDL variables in its namespace
as it has no
+parameters and produces no results.
+In fact, as of Daffodil 3.8.0, the `import` statement above is optional for the
+line-folded layers as the DFDL schema `lineFoldedLayer.dfdl.xsd` does not
contain any definitions.
+In the future however, parameters may be added, so for uniformity all layers
define a DFDL
+schema to be imported as part of using the layer.
+
+More detailed documentation for the [Line Folded Layers](#LineFoldedLayers) is
below.
+
+## Example: Base64, GZip, and BoundaryMark
+
+[EXAMPLE TBD] - should explain elements of specified length, or use of
boundaryMark or
+fixed length layers.
+
+# Defining Custom Plug-In Layers
+
+Plug-in custom layers are dynamically loaded using the Java Service Provider
Interface (SPI).
+Hence, they must be compiled into Jar files with specific META-INF metadata,
and must appear on
+the Java CLASSPATH so that they can be found and loaded.
+
+The layer API is defined via Java classes and interfaces to enable writing of custom layers in
+either Java or Scala.
+(One of the built-in layers (Gzip) is written in Java now, by way of proving
+that one can write a Layer in Java.)
+
+A layer implementation class must obey various naming
+conventions that allow Java/Scala reflection to associate the names of
Java/Scala code methods and
+method arguments with DFDL variables in that layer's namespace, that share
those names.
+
+Transformer layer classes are derived from the
+[`Layer`](../docs/latest/javadoc/org/apache/daffodil/runtime1/layers/api/Layer.html)
+base class.
+
+A name and target namespace, and hence a QName, are required to identify a layer for use from
+a DFDL schema.
+This namespace is OWNED by the layer.
+All DFDL variables
+defined in that namespace are either used to pass parameters to the
+layer code, or receive results (such as a checksum) back from the
+layer code.
+This is enforced.
+A layer that has no DFDL variables does
+not have to define a DFDL schema that defines the layer's target
+namespace, but any layer that uses DFDL variables *must* define a
+schema with the layer's namespace as its target namespace, with the
+variables declared in it (using `dfdl:defineVariable`).
+
+There is also an abstract base class for defining checksum layers:
+
+ org.apache.daffodil.runtime1.layers.api.ChecksumLayer
+
+### Layer Limiting
+
+A layer can process an entire input file/stream.
+Such a layer is said to be _unlimited_.
+More commonly, a layer has a limited region within the data stream that it is
supposed to process.
+The restriction of the layer so that it only processes the expected part of
the data is called
+_layer limiting_.
+
+There are several ways to do layer limiting, and they differ in an
+important way: how (and whether) the layer's length is limited when unparsing.
+
+#### Layer Limiting using Elements of Specified Length
+
+If the `sequence` for a layer is the model group of the
+complex type of an element, and that element has `dfdl:lengthKind` of
`'explicit'`, `'implicit'`,
+or `'pattern'`, then the length of the element limits the length of the layer
within it.
+
+In this case the length of the layer is limited when parsing, but it is NOT
limited when
+unparsing.
+
+#### Layer Limiting using the `fixedLength` Utility Layer
+
+The `fixedLength` layer fixes both the parse and
+unparse length at the value of the `layerLength` DFDL variable.
+
+Likewise, a checksum layer built with the Daffodil API `ChecksumLayer` base class
+controls the length for both parsing and unparsing, because it works
+much as the `fixedLength` layer does.
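
The effect of a fixed-length limit when parsing can be pictured with a small Java sketch: the layer sees exactly `layerLength` bytes, and whatever follows stays in the enclosing stream. This is a stand-alone illustration using JDK streams, not the `fixedLength` layer's code; the class and method names are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: carving a fixed-length region out of a longer stream.
final class FixedLengthLimitSketch {

    /** Reads exactly layerLength bytes; the caller's stream is left positioned just after them. */
    static InputStream limit(InputStream in, int layerLength) throws IOException {
        byte[] region = in.readNBytes(layerLength);
        if (region.length < layerLength) {
            throw new IOException("expected " + layerLength + " bytes, found " + region.length);
        }
        return new ByteArrayInputStream(region);
    }

    public static void main(String[] args) throws IOException {
        InputStream data = new ByteArrayInputStream("ABCDEFGH".getBytes(StandardCharsets.US_ASCII));
        InputStream layer = limit(data, 4);
        System.out.println(new String(layer.readAllBytes(), StandardCharsets.US_ASCII)); // prints ABCD
        System.out.println(new String(data.readAllBytes(), StandardCharsets.US_ASCII));  // prints EFGH
    }
}
```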
+
+#### Layer Limiting using the `boundaryMark` Utility Layer
+
+The `boundaryMark` layer takes a parameter from a DFDL variable, also named `boundaryMark`.
+Its value delimits the layer data when parsing, and
+is inserted after the (otherwise unbounded) layer data when unparsing.
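
The two directions of boundary-mark limiting can be sketched in Java as follows. Strings stand in for the byte stream to keep the example short; this is an illustration of the idea only, not the `boundaryMark` layer's code, and the class and method names are assumptions.

```java
// Illustrative sketch only: delimiting layer data with a boundary mark.
final class BoundaryMarkSketch {

    /** Parse direction: the layer data is everything before the first occurrence of the mark. */
    static String layerData(String stream, String boundaryMark) {
        int end = stream.indexOf(boundaryMark);
        if (end < 0) {
            throw new IllegalArgumentException("boundary mark not found");
        }
        return stream.substring(0, end);
    }

    /** Unparse direction: the mark is appended after the layer data. */
    static String withMark(String layerData, String boundaryMark) {
        return layerData + boundaryMark;
    }

    public static void main(String[] args) {
        System.out.println(layerData("payload--END--trailer", "--END--")); // prints payload
        System.out.println(withMark("payload", "--END--"));                // prints payload--END--
    }
}
```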
+
+### Layer Variables:
Review Comment:
Move this section to the javadoc, down to the section on Built-in layers.
##########
site/layers.md:
##########
@@ -0,0 +1,382 @@
+A layer implementation class must obey various naming
Review Comment:
Move most details about writing custom layers to the javadoc. This section
should be about USING custom layers, and how it differs from using Daffodil
built-in layers.
For guidance on coding custom layers, just link to the javadoc. All of that
should be in the javadoc for the layers.api package, and for the Layer and
ChecksumLayer classes.
##########
site/layers.md:
##########
@@ -0,0 +1,382 @@
+---
+layout: page
+title: Layers - Pluggable Extensions to Enable Algorithmic Transformations in
DFDL
+description: Pluggable Extensions to Enable Algorithmic Transformations in DFDL
+group: nav-right
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+## Introduction
+
+This page describes a DFDL language extension known as _Layers_.
+
+A _layer_ is an algorithmic transformation of the data stream that cannot be
expressed using
+regular DFDL properties.
+When parsing it is like a pre-processing of the data stream which
+happens before parsing.
+When unparsing it is like a post-processing of the data stream which
+happens after unparsing.
+The layer can underlie a part of the data stream, or all of it.
+
+### Built-in Layers
+
+Daffodil includes several built-in layers:
+- base64_MIME
+- fourbyteswap
+- twobyteswap
+- gzip
+- lineFolded_IMF
+- lineFolded_iCalendar
+
+Daffodil also includes two utility layers that are used in combination with
other layers to
+isolate the subset of the data stream the layer algorithm will operate upon.
+These are:
+- boundaryMark
+- fixedLength
+
+Each of the built-in layers will be documented separately below with examples
of their usage.
+
+### Custom Plug-In Layers
+
+Additional layers can be written in Java or Scala and deployed as _plug-ins_
for Daffodil.
+These are generally packaged as DFDL _layer schemas_, a kind of _component
schema_,
+that provide the layer packaged for import by other DFDL _assembly_ schemas
that use the
+layer in the data format they describe.
+
+## Transforming Layers and Checksum Layers
+
+There are two different kinds of layers, though they share many
characteristics. They are
+_transforming_ layers, and _checksum_ layers. Both run small algorithms over
part (or all) of
+the data stream. The difference is the purpose of the algorithm and its
output.
+
+#### Transforming Layers
+
+These layers decode data (when parsing), and encode data (when unparsing).
+The simplest example of a transforming layer is the `base64_MIME` layer which
+decodes the well known base64 encoding which is commonly used to encode binary
+data inside textual data formats.
+
+Besides decoding and encoding the data stream for the parser/unparser,
transforming layers can be
+parameterized using DFDL variables.
+They can also assign computed result values to DFDL variables, though this is
uncommon.
+
+Custom transforming layers are created by deriving an implementation from the
Daffodil API's
+[`Layer`](../docs/latest/javadoc/org/apache/daffodil/runtime1/layers/api/Layer.html)
class
+which is introduced in a later section.
+
+#### Checksum Layers
+
+Checksum layers are a simplified kind of layers which do not decode or encode
data, they simply
+pass-through the data unmodified, but while doing so they compute a checksum,
hash, or Cyclic Redundancy Check (CRC) over the data stream.
+The value of the checksum (or hash or CRC) is assigned to a DFDL variable as
the result of the
+layer. This makes the value available for use by the DFDL schema that uses the
checksum layer.
+When parsing, the value of this DFDL variable can then be compared to a
checksum field in the
+data, and either an invalid data element or an parse-error can be created if
the checksum in the
+data stream does not match the computed value.
+When unparsing, the value of this DFDL variable can be written to an element
using the
+`dfdl:outputValueCalc` property.
+
+An example of a layer plug-in is in the <a
href="https://github.com/DFDLSchemas/ethernetIP">EthernetIP</a>
+DFDL schema, which uses a Daffodil layer to describe the IPv4 packet header
checksum algorithm.
+
+Custom checksum layers are created by deriving an implementation class from
the Daffodil API's
+[`ChecksumLayer`](../docs/latest/javadoc/org/apache/daffodil/runtime1/layers/api/ChecksumLayer.
+html)
+class, which is introduced in a later section.
+
+## Layer Composition
+
+Layers can be piled on top of each other. Once a layer is in place, another
layer can be used in
+conjunction with it.
+In the section on [Using Layers](#UsingLayers) below we will look at an
example that uses both the
+built-in _gzip_ layer, and the _base64_MIME_ layer together.
+
+# Using Layers
+
+To use a layer you must know
+- the layer's namespace URI
+- the layer's name
+- the names of any layer parameter variables
+- the names of any layer result variables
+
+## Example: Line Folding
+
+As a first example, let's look at the line folding layer, specifically the
`lineFolded_IMF`layer,
+which is built-in to Daffodil.
+
+Line folding is a way of encoding textual data formats so that no line of text
is longer than
+a limited line length.
+
+Consider this data :
+```
+Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
+ tempor incididunt ut labore et dolore magna aliqua. Ut enim ad Lorem
+ ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
+ tempor incididunt ut labore et dolore magna aliqua. Ut enim ad
+```
+This data has been _line folded_ at roughly 72 characters by inserting a CRLF
+before an existing space in the data.
+Each line ends with a CRLF (\r\n) and the second through fourth lines begin
+with a space as a way of indicating that they are extension lines.
+This data is supposed to be reassembled to form a long single-line string by
removing
+all CRLF pairs.
+
+The result should be this single longer string which does not contain any line
endings:
+```
+Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad Lorem ipsum dolor sit
amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore
et dolore magna aliqua. Ut enim ad
+```
+To achieve this we would use the `lineFolded_IMF` layer.
+This layer has a specific namespace which our DFDL schema will define a prefix
`lf` for like this:
+```xml
+xmlns:lf="urn:org.apache.daffodil.layers.lineFolded"
+```
+Our DFDL schema will import the layer schema for the line-folded layer with
this import:
+```xml
+<import namespace="urn:org.apache.daffodil.layers.lineFolded"
+
schemaLocation="/org/apache/daffodil/layers/xsd/lineFoldedLayer.dfdl.xsd"/>
+```
+Then the layer is incorporated into our DFDL schema like this:
+```xml
+ <sequence dfdlx:layer="lf:lineFolded_IMF">
+ ... elements to be parsed from unfolded layer data go here ...
+ </sequence>
+```
+You can see that use of a layer is described using the `dfdx:layer` property,
and the specific layer
+is identified by a QName using the previously defined namespace prefix.
+The scope of the layer is the duration of the sequence it appears on.
+The `dfdlx:layer` property can only be used on an XSD `sequence`.
+
+The `lineFolded_IMF` layer does not define any DFDL variables in its namespace
as it has no
+parameters and produces no results.
+In fact, as of Daffodil 3.8.0, the `import` statement above is optional for the
+line-folded layers as the DFDL schema `lineFoldedLayer.dfdl.xsd` does not
contain any definitions.
+In the future however, parameters may be added, so for uniformity all layers
define a DFDL
+schema to be imported as part of using the layer.
+
+More detailed documentation for the [Line Folded Layers](#LineFoldedLayers) is
below.
+
+## Example: Base64, GZip, and BoundaryMark
+
+[EXAMPLE TBD] - should explain elements of specified length, or use of
boundaryMark or
+fixed length layers.
+
+# Defining Custom Plug-In Layers
+
+Plug-in custom layers are dynamically loaded using the Java Service Provider
Interface (SPI).
+Hence, they must be compiled into Jar files with specific META-INF metadata,
and must appear on
+the Java CLASSPATH so that they can be found and loaded.
+
+The layer API is defined via Java clases and interfaces to enable writing of
custom layers in
+either Java or Scala.
+(One of the built-in layers (Gzip) is written in Java now, by way of proving
+that one can write a Layer in Java.)
+
+A layer implementation class must obey various naming
+conventions that allow Java/Scala reflection to associate the names of
Java/Scala code methods and
+method arguments with DFDL variables in that layer's namespace, that share
those names.
+
+Transformer layer classes are derived from the
+[`Layer`](../docs/latest/javadoc/org/apache/daffodil/runtime1/layers/api/Layer.html)
+base class.
+
+A name and target namespace, and hence a QName is required to identify a layer
for use from
+a DFDL schema.
+This namespace is OWNED by the layer.
+All DFDL variables
+defined in that namespace are either used to pass parameters to the
+layer code, or receive results (such as a checksum) back from the
+layer code.
+This is enforced.
+A layer that has no DFDL variables does
+not have to define a DFDL schema that defines the layer's target
+namespace, but any layer that uses DFDL variables *must* define a
+schema with the layer's namespace as its target namespace, with the
+variables declared in it (using `dfdl:defineVariable`).
+
+There is also an abstract base class for defining checksum layers
+called,
+
+ org.apache.daffodil.runtime1.layers.api.ChecksumLayer
+
+### Layer Limiting
+
+A layer can process an entire input file/stream.
+Such a layer is said to be _unlimited_.
+More commonly, a layer has a limited region within the data stream that it is
supposed to process.
+The restriction of the layer so that it only processes the expected part of
the data is called
+_layer limiting_.
+
+There are two ways to do layer limiting, that differ in an
+important way.
+
+#### Layer Limiting using Elements of Specified Length
+
+If the `sequence` for a layer is the model group of the
+complex type of an element, and that element has `dfdl:lengthKind` of
`'explicit'`, `'implicit'`,
+or `'pattern'`, then the length of the element limits the length of the layer
within it.
+
+In this case the length of the layer is limited when parsing, but it is NOT
limited when
+unparsing.
+
+#### Layer Limiting using the `fixedLength` Utility Layer
+
+If you use the layer `fixedLength`, that dictates the parse and
+unparse length to be the value of the `layerLength` DFDL variable.
+
+Similarly, if you use a `checksum` layer built with the Daffodil API
`ChecksumLayer` base class,
+the length is controlled for both parsing and unparsing as it works
+similarly to the way the `fixedLength` layer works.
+
+#### Layer Limiting using the `boundaryMark` Utility Layer
+
+The `boundaryMark` layer uses a variable named `boundaryMark` which provides a
parameter to the
+layer which is a delimiter of the layer data when parsing, and
+which is inserted after the (otherwise unbounded) layer data when unparsing.
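
The delimiting behavior described above can be sketched in a few lines of plain Java. This is only an illustration of the idea, not Daffodil's implementation: the class and method names are hypothetical, strings stand in for the byte stream a real layer operates on, and treating a missing mark as an error is an assumption.

```java
public class BoundaryMarkSketch {
    // Parsing: the layer data is everything before the first occurrence of the mark.
    static String layerData(String stream, String mark) {
        int i = stream.indexOf(mark);
        if (i < 0) {
            // Assumption: a missing mark is treated as an error in this sketch.
            throw new IllegalArgumentException("boundary mark not found");
        }
        return stream.substring(0, i);
    }

    // Parsing: data after the mark belongs to whatever follows the layer.
    static String afterLayer(String stream, String mark) {
        int i = stream.indexOf(mark);
        if (i < 0) {
            throw new IllegalArgumentException("boundary mark not found");
        }
        return stream.substring(i + mark.length());
    }

    // Unparsing: the mark is appended after the otherwise unbounded layer data.
    static String unparse(String data, String mark) {
        return data + mark;
    }

    public static void main(String[] args) {
        String stream = "payload--END--trailer";
        System.out.println(layerData(stream, "--END--"));
        System.out.println(afterLayer(stream, "--END--"));
        System.out.println(unparse("payload", "--END--"));
    }
}
```

Note that the mark itself is consumed when parsing and is not part of the layer data.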
+
+### Layer Variables:
+
+Each variable is either a parameter passed to a special setter named
+`setLayerVariableParameters`, or is a return variable which is populated
+from a special getter.
+The special setter args and getters taken
+together must use up all the DFDL variables defined in the layer's
+namespace. (See the javadoc for the Layer base class.)
+
+A special getter example for a checksum layer:
+
+ int getLayerVariableResult_checksum() { .... }
+
+This corresponds to, and returns the value that populates, the DFDL
+variable named `checksum` in the layer's namespace.
+That DFDL variable must have a DFDL type corresponding to the `int` type
(`xs:unsignedShort` or
+`xs:int`).
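
To make the naming convention concrete, here is a minimal, self-contained sketch of how reflection can derive a variable name from a getter name. It is not Daffodil's code: `FakeChecksumLayer` is a hypothetical stand-in that does not extend the real `Layer` class, and `resultVariables` is an illustrative helper. Only the `getLayerVariableResult_` prefix comes from the text above.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

public class LayerNamingSketch {
    static final String RESULT_PREFIX = "getLayerVariableResult_";

    // Hypothetical stand-in for a layer class; NOT derived from Daffodil's Layer.
    public static class FakeChecksumLayer {
        public int getLayerVariableResult_checksum() {
            return 0xB861;
        }
    }

    // Collects (variable name -> value) for every public zero-argument getter
    // whose name starts with the special prefix.
    static Map<String, Object> resultVariables(Object layer) {
        Map<String, Object> results = new TreeMap<>();
        for (Method m : layer.getClass().getMethods()) {
            if (m.getName().startsWith(RESULT_PREFIX) && m.getParameterCount() == 0) {
                String variableName = m.getName().substring(RESULT_PREFIX.length());
                try {
                    results.put(variableName, m.invoke(layer));
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("cannot call " + m.getName(), e);
                }
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // The single entry is keyed by the DFDL variable name "checksum".
        System.out.println(resultVariables(new FakeChecksumLayer()));
    }
}
```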
+
+Layers could have multiple result variables, but we have no actual
+examples of more than one value being returned.
+(Returning multiple variables is tested.
+We just have no real use cases for it at this time.)
+
+### Layer Exception Handling:
+
+The method
+
+ setProcessingErrorException(...)
+
+allows the layer to specify that specific exceptions or runtime
+exceptions thrown by the layer are converted into processing
+errors.
+This eliminates most need for layer code to contain try-catches.
+
+#### About Testing:
+
+There are tests for every way that
+a user can goof up the definition of a layer class, and there are
+tests for `processingError`, `runtimeSchemaDefinitionError`, and throwing
+an `Exception` from every place a user-defined Layer could cause these.
+Parse errors cause backtracking in all sensible cases.
+Nothing aborts.
+
+## Compatibility with Daffodil 3.7.0 and prior
+
+This new Layer API is entirely incompatible with schemas or layer code
+from Daffodil 3.7.0 and all prior versions of Daffodil.
+
+The layer
+feature was just an experimental feature in Daffodil, so we reserved
+the right to change it, and for Daffodil 3.8.0 it has changed radically.
+
+However, it is our hope that the Layer API introduced in
+Daffodil 3.8.0 will prove to be stable and supportable long term.
+
+
+
+# Daffodil Built-In Layer Documentation
+
+## Line Folded Layers
+
+- Namespace URI: urn:org.apache.daffodil.layers.lineFolded
+- No DFDL Variables are read or set
+- There are two layer names for two variations:
+- - [lineFolded_IMF](#layer-name-linefolded_imf) - conforms to IETF RFC 2822
Internet Message Format (IMF)
+- - [lineFolded_iCalendar](#layer-name-linefolded_icalendar) - conforms to
IETF RFC 5545 iCalendar
+
+### General Usage
+
+These layers would normally be used in a context of either an enclosing element
+which bounds its length (`dfdl:lengthKind 'explicit'`, `'prefixed'`, or
`'pattern'`)
+or an enclosing utility layer that bounds its length (ex: `boundaryMark`).
+
+There is a limitation on the compatibility of line folding of data
+with adjacent parts of the format which also use line-endings.
+For example, line folding can interact badly with surrounding elements of
`dfdl:lengthKind
+'pattern'` if the pattern is, for example `".*?\\r\\n(?!(?:\\t|\\ ))"` which
is anything up to
+and including a CRLF not followed by a space or tab.
+The problem is that line folding
+converts isolated \n or \r into \r\n, and if this just happens to be followed
by a
+non space/tab character this will have inserted an end-of-data in the middle
of the
+data.
+
+### Layer Name: lineFolded_IMF
+
+For IMF, unfolding simply removes CRLFs if they are followed by a space or tab.
+
+When unparsing, the folding is more complex, as CRLFs can only be inserted
before
+a space/tab that appears in the data. If the data has no spaces, then no
+folding is possible.
+If there are spaces/tabs, the one closest to (and before) position 78 is used
unless it is
+followed by punctuation, in which case a prior space/tab (if it exists) is
used.
+(This preference for spaces not followed by punctuation is not
+required, but it is preferred in the IMF RFC.)
+
+Note: folding is done by some systems in a manner that does not respect
+character boundaries - i.e., in utf-8, a multi-byte character sequence may be
+broken in the middle by insertion of a CRLF. Hence, unfolding initially treats
+the text as iso-8859-1, i.e., just bytes, and removes CRLFs, then subsequently
+re-interprets the bytes as the expected charset such as utf-8.
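
The bytes-first unfolding described in the note above might look like the following. This is a hedged sketch rather than the layer's actual code: `ImfUnfoldSketch` and `unfold` are hypothetical names, and the whole input is held in memory, whereas a real layer works on a stream.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ImfUnfoldSketch {
    // Unfolds IMF-style folded data: a CRLF is removed only when it is
    // followed by a space or tab, and that space or tab is kept.
    static String unfold(byte[] folded, Charset expected) {
        // Step 1: view the data as ISO-8859-1, which maps each byte to one char.
        String asBytes = new String(folded, StandardCharsets.ISO_8859_1);
        // Step 2: drop every CRLF that is followed by a space or tab (lookahead).
        String unfolded = asBytes.replaceAll("\r\n(?=[ \t])", "");
        // Step 3: recover the bytes and decode them in the expected charset.
        return new String(unfolded.getBytes(StandardCharsets.ISO_8859_1), expected);
    }

    public static void main(String[] args) {
        byte[] data = "Subject: caf\u00e9\r\n au lait\r\nNext"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(unfold(data, StandardCharsets.UTF_8));
    }
}
```

The CRLF before `Next` survives because it is not followed by a space or tab.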
+
+IMF is supposed to be US-ASCII, but implementations have come to preserve
8-bit characters,
+so the above problem really can occur.
+
+IMF has a maximum line length of 998 characters per line excluding the CRLF.
+
+- _WARNING This check for 998 characters is not implemented (As of Daffodil
version 3.8.0)._
+
+The layer should fail (cause a parse error) if a line longer than this is
encountered
+or constructed after unfolding. When unparsing, if a line longer than 998
cannot be
+folded due to no spaces/tabs being present in it, then it is an unparse error.
+
+Note that i/vCalendar, vCard, and MIME payloads held by IMF do not run into
+the IMF line length issues, in that they have their own line length limits that
+are smaller than those of IMF, and which do not require accommodation by having
+pre-existing spaces/tabs in the data. So such data will always be short
+enough lines.
+
+### Layer Name: lineFolded_iCalendar
+
+For iCalendar (including vCard and vCalendar), the maximum is 75 bytes plus
the CRLF, for
+a total of 77. Folding is done by inserting a CRLF followed by a space or tab. The
+CRLF and the following space or tab are removed to unfold. If data happened to
+contain a CRLF followed by a space or tab initially, then that will be lost
when
+the data is parsed.
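
The iCalendar rule can be illustrated with a small sketch. It is not the layer's implementation: the names are hypothetical, the data is assumed to be ASCII so that characters and bytes coincide, and counting the inserted space toward the 75-byte limit of a continuation line is an assumption of this sketch.

```java
public class ICalendarFoldSketch {
    static final int MAX = 75; // maximum line length, excluding the CRLF

    // Unfolding removes each CRLF together with the space or tab that follows it.
    static String unfold(String folded) {
        return folded.replaceAll("\r\n[ \t]", "");
    }

    // Folding inserts CRLF + space wherever a line would exceed MAX.
    static String fold(String line) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        int room = MAX; // the first line may hold MAX characters
        while (line.length() - pos > room) {
            out.append(line, pos, pos + room).append("\r\n ");
            pos += room;
            room = MAX - 1; // continuation lines start with the inserted space
        }
        return out.append(line, pos, line.length()).toString();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append('x');
        }
        String original = sb.toString();
        String folded = fold(original);
        System.out.println(unfold(folded).equals(original));
        // Data that already contained CRLF + space loses it when unfolded.
        System.out.println(unfold("a\r\n b"));
    }
}
```

The last line of `main` shows the data loss mentioned above: a CRLF and space that were part of the original data are indistinguishable from a fold.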
+
+For MIME, the maximum line length is 76.
+
+
+-----------
Review Comment:
Add sections for other built-in layers.
##########
site/layers.md:
##########
@@ -0,0 +1,382 @@
+---
+layout: page
+title: Layers - Pluggable Extensions to Enable Algorithmic Transformations in
DFDL
+description: Pluggable Extensions to Enable Algorithmic Transformations in DFDL
+group: nav-right
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+## Introduction
+
+This page describes a DFDL language extension known as _Layers_.
+
+A _layer_ is an algorithmic transformation of the data stream that cannot be
expressed using
+regular DFDL properties.
+When parsing it is like a pre-processing of the data stream which
+happens before parsing.
+When unparsing it is like a post-processing of the data stream which
+happens after unparsing.
+The layer can underlie a part of the data stream, or all of it.
+
+### Built-in Layers
+
+Daffodil includes several built-in layers:
+- base64_MIME
+- fourbyteswap
+- twobyteswap
+- gzip
+- lineFolded_IMF
+- lineFolded_iCalendar
+
+Daffodil also includes two utility layers that are used in combination with
other layers to
+isolate the subset of the data stream the layer algorithm will operate upon.
+These are:
+- boundaryMark
+- fixedLength
+
+Each of the built-in layers will be documented separately below with examples
of their usage.
+
+### Custom Plug-In Layers
+
+Additional layers can be written in Java or Scala and deployed as _plug-ins_
for Daffodil.
+These are generally packaged as DFDL _layer schemas_, a kind of _component
schema_,
+that provide the layer packaged for import by other DFDL _assembly_ schemas
that use the
+layer in the data format they describe.
+
+## Transforming Layers and Checksum Layers
+
+There are two different kinds of layers, though they share many
characteristics. They are
+_transforming_ layers, and _checksum_ layers. Both run small algorithms over
part (or all) of
+the data stream. The difference is the purpose of the algorithm and its
output.
+
+#### Transforming Layers
+
+These layers decode data (when parsing), and encode data (when unparsing).
+The simplest example of a transforming layer is the `base64_MIME` layer which
+decodes the well known base64 encoding which is commonly used to encode binary
+data inside textual data formats.
+
+Besides decoding and encoding the data stream for the parser/unparser,
transforming layers can be
+parameterized using DFDL variables.
+They can also assign computed result values to DFDL variables, though this is
uncommon.
+
+Custom transforming layers are created by deriving an implementation from the
Daffodil API's
+[`Layer`](../docs/latest/javadoc/org/apache/daffodil/runtime1/layers/api/Layer.html)
class
+which is introduced in a later section.
+
+#### Checksum Layers
+
+Checksum layers are a simplified kind of layer that do not decode or encode
data. They simply
+pass the data through unmodified, but while doing so they compute a checksum,
hash, or Cyclic Redundancy Check (CRC) over the data stream.
+The value of the checksum (or hash or CRC) is assigned to a DFDL variable as
the result of the
+layer. This makes the value available for use by the DFDL schema that uses the
checksum layer.
+When parsing, the value of this DFDL variable can then be compared to a
checksum field in the
+data, and either an invalid data element or a parse error can be created if
the checksum in the
+data stream does not match the computed value.
+When unparsing, the value of this DFDL variable can be written to an element
using the
+`dfdl:outputValueCalc` property.
+
+An example of a layer plug-in is in the <a
href="https://github.com/DFDLSchemas/ethernetIP">EthernetIP</a>
+DFDL schema, which uses a Daffodil layer to describe the IPv4 packet header
checksum algorithm.
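
For readers unfamiliar with that algorithm, the IPv4 header checksum is the 16-bit one's complement of the one's complement sum of the header's 16-bit words (see RFC 1071). The sketch below shows the computation on its own, in plain Java. It is not taken from the EthernetIP schema's layer code, the class name is hypothetical, and the checksum field of the input is assumed to be zeroed already.

```java
public class Ipv4ChecksumSketch {
    // Computes the Internet checksum over a header whose checksum field is zero.
    static int checksum(byte[] header) {
        int sum = 0;
        // Add up the header as big-endian 16-bit words.
        for (int i = 0; i + 1 < header.length; i += 2) {
            sum += ((header[i] & 0xFF) << 8) | (header[i + 1] & 0xFF);
        }
        // Fold any carries above 16 bits back into the low 16 bits.
        while ((sum >>> 16) != 0) {
            sum = (sum & 0xFFFF) + (sum >>> 16);
        }
        // The checksum is the one's complement of the folded sum.
        return ~sum & 0xFFFF;
    }

    public static void main(String[] args) {
        byte[] header = {
            0x45, 0x00, 0x00, 0x73, 0x00, 0x00, 0x40, 0x00,
            0x40, 0x11, 0x00, 0x00, // bytes 10 and 11: checksum field, zeroed
            (byte) 0xC0, (byte) 0xA8, 0x00, 0x01,
            (byte) 0xC0, (byte) 0xA8, 0x00, (byte) 0xC7
        };
        System.out.printf("%04x%n", checksum(header));
    }
}
```

In a checksum layer, a value like this would be the result assigned to the layer's DFDL variable.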
+
+Custom checksum layers are created by deriving an implementation class from
the Daffodil API's
+[`ChecksumLayer`](../docs/latest/javadoc/org/apache/daffodil/runtime1/layers/api/ChecksumLayer.html)
+class, which is introduced in a later section.
+
+## Layer Composition
+
+Layers can be piled on top of each other. Once a layer is in place, another
layer can be used in
+conjunction with it.
+In the section on [Using Layers](#UsingLayers) below we will look at an
example that uses both the
+built-in _gzip_ layer, and the _base64_MIME_ layer together.
+
+# Using Layers
+
+To use a layer you must know
+- the layer's namespace URI
+- the layer's name
+- the names of any layer parameter variables
+- the names of any layer result variables
+
+## Example: Line Folding
+
+As a first example, let's look at the line folding layer, specifically the
`lineFolded_IMF` layer,
+which is built-in to Daffodil.
+
+Line folding is a way of encoding textual data formats so that no line of text
is longer than
+a limited line length.
+
+Consider this data:
+```
+Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
+ tempor incididunt ut labore et dolore magna aliqua. Ut enim ad Lorem
+ ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
+ tempor incididunt ut labore et dolore magna aliqua. Ut enim ad
+```
+This data has been _line folded_ at roughly 72 characters by inserting a CRLF
+before an existing space in the data.
+Each line ends with a CRLF (\r\n) and the second through fourth lines begin
+with a space as a way of indicating that they are extension lines.
+This data is supposed to be reassembled to form a long single-line string by
removing
+all CRLF pairs.
+
+The result should be this single longer string which does not contain any line
endings:
+```
+Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad Lorem ipsum dolor sit
amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore
et dolore magna aliqua. Ut enim ad
+```
+To achieve this we would use the `lineFolded_IMF` layer.
+This layer has a specific namespace, for which our DFDL schema defines the prefix
`lf` like this:
+```xml
+xmlns:lf="urn:org.apache.daffodil.layers.lineFolded"
+```
+Our DFDL schema will import the layer schema for the line-folded layer with
this import:
+```xml
+<import namespace="urn:org.apache.daffodil.layers.lineFolded"
+
schemaLocation="/org/apache/daffodil/layers/xsd/lineFoldedLayer.dfdl.xsd"/>
+```
+Then the layer is incorporated into our DFDL schema like this:
+```xml
+ <sequence dfdlx:layer="lf:lineFolded_IMF">
+ ... elements to be parsed from unfolded layer data go here ...
+ </sequence>
+```
+You can see that use of a layer is described using the `dfdlx:layer` property,
and the specific layer
+is identified by a QName using the previously defined namespace prefix.
+The scope of the layer is the duration of the sequence it appears on.
+The `dfdlx:layer` property can only be used on an XSD `sequence`.
+
+The `lineFolded_IMF` layer does not define any DFDL variables in its namespace
as it has no
+parameters and produces no results.
+In fact, as of Daffodil 3.8.0, the `import` statement above is optional for the
+line-folded layers as the DFDL schema `lineFoldedLayer.dfdl.xsd` does not
contain any definitions.
+In the future however, parameters may be added, so for uniformity all layers
define a DFDL
+schema to be imported as part of using the layer.
+
+More detailed documentation for the [Line Folded Layers](#LineFoldedLayers) is
below.
+
+## Example: Base64, GZip, and BoundaryMark
+
+[EXAMPLE TBD] - should explain elements of specified length, or use of
boundaryMark or
+fixed length layers.
+
+# Defining Custom Plug-In Layers
+
+Plug-in custom layers are dynamically loaded using the Java Service Provider
Interface (SPI).
+Hence, they must be compiled into Jar files with specific META-INF metadata,
and must appear on
+the Java CLASSPATH so that they can be found and loaded.
+
+The layer API is defined via Java classes and interfaces to enable writing of
custom layers in
+either Java or Scala.
+(One of the built-in layers (Gzip) is written in Java now, by way of proving
+that one can write a Layer in Java.)
+
+A layer implementation class must obey various naming
+conventions that allow Java/Scala reflection to associate the names of
Java/Scala code methods and
+method arguments with DFDL variables in that layer's namespace, that share
those names.
+
+Transformer layer classes are derived from the
+[`Layer`](../docs/latest/javadoc/org/apache/daffodil/runtime1/layers/api/Layer.html)
+base class.
+
+A name and target namespace, and hence a QName, are required to identify a layer
for use from
Review Comment:
Move this paragraph to the javadoc.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]