(daffodil-site) branch main updated: Add doc for dfdlx: alignmentKind, direction, repType, bits functions, BLOBs.

mbeckerle Tue, 04 Nov 2025 09:19:54 -0800

This is an automated email from the ASF dual-hosted git repository.

mbeckerle pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/daffodil-site.git



The following commit(s) were added to refs/heads/main by this push:
     new 17f2e9c  Add doc for dfdlx: alignmentKind, direction, repType, bits 
functions, BLOBs.
17f2e9c is described below

commit 17f2e9c58ab3fb775a9adb8c8252309147ba4c8e
Author: Michael Beckerle <[email protected]>
AuthorDate: Tue Nov 4 12:18:01 2025 -0500

    Add doc for dfdlx: alignmentKind, direction, repType, bits functions, BLOBs.
    
    Note that dfdlx:repValueRanges is deprecated and is not documented
    for LTS. (Per Confluence page)
    
    Added table of contents to these complex pages.
    
    Removed doc of deprecated daf:error function.
    
    Also fix closed jira ticket reference on unsupported page
    
    DAFFODIL-3044
    
    # Conflicts:
    #       site/dfdl-extensions.md
    #       site/layers.md
---
 site/binary-large-objects.md | 171 ++++++++++++++++++++
 site/dfdl-extensions.md      | 363 +++++++++++++++++++++++++++++++++++++------
 site/layers.md               |  51 ++++--
 site/unsupported.md          |   2 +-
 4 files changed, 523 insertions(+), 64 deletions(-)

diff --git a/site/binary-large-objects.md b/site/binary-large-objects.md
new file mode 100644
index 0000000..07150f5
--- /dev/null
+++ b/site/binary-large-objects.md
@@ -0,0 +1,171 @@
+---
+description: Binary Large Objects Feature
+group: 'nav-right'
+layout: page
+title: 'Binary Large Objects Feature'
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+## Table of Contents
+{:.no_toc}
+<!-- The {: .no_toc } excludes the above heading from the ToC --> 
+
+1. yes, this is the standard Jekyll way to do a ToC (this line gets removed)
+{:toc}
+<!-- note the above line {:toc} cannot have whitespace at the start --> 
+
+
+<!--
+This page is linked from https://s.apache.org/daffodil-blob-feature.
+ If this page content moves, please update that link from https://s.apache.org.
+-->
+
+# Introduction
+
+Daffodil has implemented a DFDL extension that allows data much larger than 
memory to be manipulated.
+
+A variety of data formats, such as for image and video files, consist of 
fields of what is effectively metadata, surrounding large blocks of data 
containing compressed image or video data.
+
+An important use case for DFDL is to expose this metadata for easy use, and to 
provide access to 
+the large data via a streaming mechanism akin to opening a file, thereby 
avoiding
+large `xs:hexBinary` strings in the infoset.
+
+In RDBMS systems, BLOB (Binary Large Object) is the type used when the data 
row returned from an SQL query will not contain the actual value data, but 
rather a handle that can be used to open/read/write/close the BLOB.
+
+Daffodil has an analogous BLOB capability. 
+This enables processing of images or video of arbitrary size without the need 
to ever hold all the data in memory.
+
+This also bypasses the limitation on object size.
+
+
+# Type `xs:anyURI` and Property `dfdlx:objectKind`
+
+DFDL is extended to allow simple types to have the `xs:anyURI` type. 
+Elements with this type will be treated as BLOB objects.
+
+The `dfdlx:objectKind` property is added to define what type of object it is. 
+The only valid value for this property is `"bytes"`, which specifies a binary large object.
+All other values are reserved for future extensions of this feature.
+
+An example of this usage in a DFDL schema may look something like this:
+
+```xsd
+<xs:schema
+  xmlns:xs="http://www.w3.org/2001/XMLSchema"
+  xmlns:dfdlx="http://www.ogf.org/dfdl/dfdl-1.0/extensions"
+  xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
+
+  <xs:element name="data" type="xs:anyURI" 
+    dfdlx:objectKind="bytes"
+    dfdl:lengthUnits="bytes"
+    dfdl:length="1024" />
+
+</xs:schema>
+```
+
+The resulting infoset (as XML) will look something like this:
+
+```xml
+<data>file:///path/to/blob/data</data>
+```
+
+The 1024 bytes of data are written to a file at location `/path/to/blob/data`.
+
+The BLOB URI must always use the _file scheme_ and must be absolute. 
+
+# Daffodil BLOB API
+
+API calls are used to specify where Daffodil should write the BLOB files.
+
+Two functions on the Daffodil `InfosetOutputter` are used.
+
+The first API function sets the properties used when
+creating BLOB files: the output directory and the prefix and suffix
+for the BLOB file name.
+
+```scala
+/**
+ * Set the attributes for how to create blob files.
+ *
+ * @param directory the Path to the directory in which to create files. If the
+ *                  directory does not exist, Daffodil will attempt to create
+ *                  it before writing a blob.
+ * @param prefix the prefix string to be used in generating a blob file name
+ * @param suffix the suffix string to be used in generating a blob file name
+ */
+final def setBlobAttributes(directory: Path, prefix: String, suffix: String)
+```
+
+The second API function returns a list of
+all BLOB files that were created during `parse()`.
+
+```scala
+/**
+ * Get the list of blob paths that were output in the infoset.
+ *
+ * This is the same as what would be found by iterating over the infoset.
+ */
+final def getBlobFiles(): Seq[Path]
+```
+
+Note that no changes to the `unparse()` API are required, since the BLOB URI 
provides 
+all the necessary information to retrieve files containing BLOB data.
+
+BLOB files are not automatically deleted.
+It is the responsibility of the API user to determine when files are no 
+longer needed and remove them.
+
+# DFDL Expressions
+
+Any expression access to the _data_ of a BLOB element will result in a
+Schema Definition Error during schema compilation.
+
+The _length_ of a BLOB element is available since it is very common in
+data formats to include both a BLOB payload and the length of that
+payload. On unparse, we can calculate the length of the BLOB data so
+that the value can be output in a length field in the data. This is
+done using the regular `dfdl:contentLength()` and `dfdl:valueLength()`
+functions.
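+
+As a hedged sketch of the length pattern described above, the fragment below pairs a stored
+length field with a BLOB payload. The element names `len` and `data` and the 4-byte binary
+length field are assumptions made for illustration; they are not taken from any particular schema.
+
+```xsd
+<xs:sequence>
+  <!-- Hypothetical length field; on unparse it is computed from the BLOB -->
+  <xs:element name="len" type="xs:unsignedInt"
+    dfdl:representation="binary"
+    dfdl:lengthKind="explicit"
+    dfdl:lengthUnits="bytes"
+    dfdl:length="4"
+    dfdl:outputValueCalc="{ dfdl:valueLength(../data, 'bytes') }" />
+
+  <!-- BLOB payload whose length in bytes is given by the preceding field -->
+  <xs:element name="data" type="xs:anyURI"
+    dfdlx:objectKind="bytes"
+    dfdl:lengthKind="explicit"
+    dfdl:lengthUnits="bytes"
+    dfdl:length="{ ../len }" />
+</xs:sequence>
+```
+
+Only the length of `data` is referenced from an expression here; its content is never accessed.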
+
+
+# Testing DFDL Schemas using BLOBs via the TDML Runner
+
+The TDML language is extended to support the `xsi:type="xs:anyURI"` annotation 
on XML data elements.
+
+For example:
+
+```xml
+<tdml:dfdlInfoset>
+  <data xsi:type="xs:anyURI">path/to/blob/data</data>
+</tdml:dfdlInfoset>
+```
+
+The path provided as the URI value can be, and usually will be, a relative 
path within the 
+`src/test/resources` directory of the DFDL schema project. 
+During Infoset comparisons the TDML Runner will compare the contents of this 
file 
+with the BLOB file in the corresponding element (having type `xs:anyURI`) of 
the infoset.  
+
+BLOB files created when running the tests are deleted when the test completes.
+
+# Command Line Interface
+
+The CLI supports ad-hoc testing of the use of BLOBs.
+BLOBs are written to a subdirectory named `daffodil-blobs` of the directory given by the
+JVM _System Property_ `user.dir`.
+If the `daffodil-blobs` directory does not exist, Daffodil will attempt to create it.
+The CLI does not delete any BLOB files. 
diff --git a/site/dfdl-extensions.md b/site/dfdl-extensions.md
index 092d503..792fdb7 100644
--- a/site/dfdl-extensions.md
+++ b/site/dfdl-extensions.md
@@ -1,6 +1,6 @@
 ---
 layout: page
-title: DFDL Extensions
+title: Daffodil Extensions to the DFDL Language 
 group: nav-right
 ---
 <!--
@@ -22,37 +22,61 @@ limitations under the License.
 {% endcomment %}
 -->
 
+## Table of Contents 
+{:.no_toc} 
+<!-- The {: .no_toc } excludes the above heading from the ToC --> 
+
+1. yes, this is the standard Jekyll way to do a ToC (this line gets removed)
+{:toc}
+<!-- note the above line {:toc} cannot have whitespace at the start --> 
+
+# Introduction
+
 Daffodil provides extensions to the DFDL specification. 
-These properties are in the namespace defined by the URI 
+These functions and properties are in the namespace defined by the URI 
 ``http://www.ogf.org/dfdl/dfdl-1.0/extensions`` which is normally bound to the 
``dfdlx`` prefix 
 like so: 
 
-
 ``` xml
-<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
-           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
-           xmlns:dfdlx="http://www.ogf.org/dfdl/dfdl-1.0/extensions"
->
+<schema xmlns="http://www.w3.org/2001/XMLSchema"
+        xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
+        xmlns:dfdlx="http://www.ogf.org/dfdl/dfdl-1.0/extensions">
 ```
 
-The following symbols defined in this namespace are described below.
+The DFDL language extensions described below have Long Term Support (LTS) in 
Daffodil 
+going forward, and are proposed for inclusion in a future revision of the DFDL 
+standard. 
+DFDL schema authors can depend on the features and behaviors defined here 
without fear 
+that these extensions will be withdrawn in the future. 
 
-# Expression Functions
+# Binary Large Objects (BLOB) Feature
 
-## ``daf:error()``
+Daffodil supports processing data that contains large opaque binary objects,
+also known as _BLOBs_. 
+These enable processing of data types such as images, audio, or video where 
the 
+data content is surrounded by important metadata. 
+The DFDL Schema can expose the metadata fields for processing and carry 
+along the opaque BLOB data in files.
 
-A function that can be used in DFDL expressions. This functions does not 
return a value or accept any arguments. When called, it causes a Parse Error or 
Unparse Error.
+There is 
+[separate documentation for the Binary Large Object (BLOB) 
feature](/binary-large-objects).
 
-*This function is deprecated as of Daffodil 2.0.0. Use the ``fn:error(...)`` 
function instead.*
+# Expression Functions
 
-## ``dfdlx:trace($value, $label)``
+## `dfdlx:trace(value, label)`
 
-A function that can be used in DFDL expressions, similar to the ``fn:trace()`` 
function. This logs the string ``$label`` followed by ``$value`` converted to a 
string and returns ``$value``. The second argument must be of type 
``xs:string``.
+A function that can be used to debug DFDL expressions, similar to 
+the [XPath ``fn:trace(value, 
label)``](https://www.w3.org/TR/xpath-functions-31/#func-trace) 
+function. 
+This creates a message from the string argument ``label`` followed by 
``value`` converted to a 
+string and logs the message. 
+The function returns the ``value``. 
+The second `label` argument must be of type ``xs:string``.
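+
+A minimal sketch of how this might be used, assuming a hypothetical sibling element `len` that
+holds a payload length: wrapping the reference in `dfdlx:trace` logs the value each time the
+expression is evaluated and leaves the result of the expression unchanged.
+
+```xsd
+<!-- The element names `len` and `payload` are illustrative only -->
+<xs:element name="payload" type="xs:hexBinary"
+  dfdl:lengthKind="explicit"
+  dfdl:lengthUnits="bytes"
+  dfdl:length="{ dfdlx:trace(../len, 'payload length') }" />
+```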
 
-## ``dfdlx:lookAhead(offset, bitSize)``
+## `dfdlx:lookAhead(offset, bitSize)`
 
-Read ``bitSize`` bits, where the first bit is located at an ``offset`` (in 
bits)
-from the current location. The result is a ``xs:nonNegativeInteger``. 
Restrictions:
+Read `bitSize` bits, where the first bit is located at an ``offset`` (in bits)
+   from the current location. The result is a ``xs:nonNegativeInteger``. 
Restrictions:
    
 - offset >=0
 - bitSize >= 1
@@ -67,10 +91,12 @@ and data location.
 the data being read will not be used.
   
 ### Examples of `dfdlx:lookAhead`
-
+   
 The following two elements both populate element `a` with the value of the 
next 3 bits as an 
-unsignedInt. They are not completely equivalent because the first will consume 
3 bits of the 
+unsignedInt. 
+They are not completely equivalent because the first will consume 3 bits of 
the 
 input stream where the second will not advance the input stream.
+
 ```xml
 <xs:element name="a" type="xs:unsignedInt" dfdl:length="3" 
dfdl:lengthUnits="bits" />
 
@@ -81,51 +107,296 @@ In this case the choice of elements `a` vs. `b` depends 
on the value of the `tag
 found after fields `a` and `b`:
 ```
 <xs:choice dfdl:choiceDispatchKey="{ dfdlx:lookAhead(16,8) }">
-<xs:element name="a" type="xs:int" dfdl:length="16" dfdl:choiceBranchKey="1"/>
-<xs:element name="b" type="xs:int" dfdl:length="16" dfdl:choiceBranchKey="2"/>
+  <xs:element name="a" type="xs:int" dfdl:length="16" 
dfdl:choiceBranchKey="1"/>
+  <xs:element name="b" type="xs:int" dfdl:length="16" 
dfdl:choiceBranchKey="2"/>
 </xs:choice>
 <xs:element name="tag" type="xs:int" dfdl:length="8" />
-     ```
-# Bitwise Functions
+```
 
-TBD, but the complete list (all ``dfdlx``) is `BitAnd`, `BitNot`, `BitOr`, 
`BitXor`, `LeftShift`, 
-`RightShift`
+## Bitwise Functions: `bitAnd`, `bitOr`, `bitXor`, `bitNot`, `leftShift`, 
`rightShift`
 
-## ``dfdlx:doubleFromRawLong`` and ``dfdlx:doubleToRawLong``
+These functions are defined on types `long`, `int`, `short`, `byte`, `unsignedLong`,
+`unsignedInt`, `unsignedShort`, and `unsignedByte`.
 
-Converting binary floating point numbers to/from base 10 text can result in 
lost information.
-The base 10 representation, converted back to binary representation, may not 
be bit-for-bit 
-identical. These functions can be used to carry 8-byte double precision IEEE 
floating point 
-numbers as type `xs:long` so that no information is lost. The DFDL schema can 
still obtain 
-and operate on the floating point value by converting these `xs:long` values 
into type 
-`xs:double`, and back if necessary for unparsing a new value. 
+### `dfdlx:bitAnd(arg1, arg2)`
 
-# Properties
+This computes the bitwise AND of two integers. 
 
-## ``dfdlx:parseUnparsePolicy``
+- Both arguments must be signed, or both must be unsigned.
+- If the two arguments are not the same type the smaller one is converted into 
the type of the 
+larger one. 
+- If the smaller argument is signed, this conversion does sign-extension.
+- The result type is that of the larger argument.
 
-A property applied to simple and complex elements, which specifies whether the 
element supports only parsing, only unparsing, or both parsing and unparse. 
Valid values for this property are ``parse``, ``unparse``, or ``both``. This 
allows one to leave off properties that are required for only parse or only 
unparse, such as ``dfdl:outputValueCalc`` or ``dfdl:outputNewLine``, so that 
one may have a valid schema if only a subset of functionality is needed.
+### `dfdlx:bitOr(arg1, arg2)`
 
-All elements must have a compatible parseUnparsePolicy with the compilation 
parseUnparsePolicy (which is defined by the root element daf:parseUnparsePolicy 
and/or the Daffodil parseUnparsePolicy tunable) or it is a Schema Definition 
Error. An element is defined to have a compatible parseUnparsePolicy if it has 
the same value as the compilation parseUnparsePolicy or if it has the value 
``both``.
+This computes the bitwise OR of two integers.
 
-For compatibility, if this property is not defined, it is assumed to be 
``both``.
+- Both arguments must be signed, or both must be unsigned.
+- If the two arguments are not the same type the smaller one is converted into 
the type of the
+larger one.
+- If the smaller argument is signed, this conversion does sign-extension.
+- The result type is that of the larger argument.
+
+### `dfdlx:bitXor(arg1, arg2)`
+
+This computes the bitwise Exclusive OR of two integers.
+
+- Both arguments must be signed, or both must be unsigned.
+- If the two arguments are not the same type the smaller one is converted into 
the type of the
+larger one. 
+- If the smaller argument is signed, this conversion does sign-extension.
+- The result type is that of the larger argument.
+
+### `dfdlx:bitNot(arg)`
+
+This computes the bitwise NOT of an integer. Every bit is inverted. The result 
type is the same 
+as the argument type. 
 
-## ``dfdlx:layer``
+### `dfdlx:leftShift(value, shiftCount)`
 
-[Layers](/layers) provide algorithmic capabilities for decoding/encoding data 
or computing 
-checksums. Some are built-in to Daffodil. New layers can be created in 
Java/Scala and 
-plugged-in to Daffodil dynamically. 
+This is the _logical_ shift left, meaning that bits are shifted from 
less-significant positions 
+to more-significant positions. 
 
-## ``dfdlx:direction``
+- The left-most bits shifted out are discarded. 
+- Zeros are shifted in for the right-most bits. 
+- The result type is the same as the `value` argument type. 
+- It is a processing error if the `shiftCount` argument is < 0.
+- It is a processing error if the `shiftCount` argument is greater than the 
number of 
+  bits in the type of the value argument. 
 
-TBD
+### `dfdlx:rightShift(value, shiftCount)`
 
-## ``dfdlx:repType``, ``dfdlx:repValues``, and ``dfdlx:repValueRanges``
+This is the _arithmetic_ shift right, meaning bits move from most-significant 
to 
+less-significant positions.
+If _logical_ (zero-filling) shift right is needed, you must use unsigned types.
 
-TBD
+- The `value` argument is shifted by the `shiftCount`.
+- The right-most bits shifted out are discarded. 
+- If the `value` is signed, then the sign bit is shifted in for the left-most 
bits.
+- If the `value` is unsigned, then zeros are shifted in for the left-most 
bits. 
+- The result type is the same as the `value` argument type.
+- It is a processing error if the `shiftCount` argument is < 0.
+- It is a processing error if the `shiftCount` argument is greater than the 
number of
+  bits in the type of the value argument.
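+
+The rules above can be illustrated with a hedged sketch that splits one byte into two 4-bit
+values. The element names are hypothetical, the fragment is assumed to sit inside an
+`xs:sequence`, and the `xs:unsignedByte(...)` constructor is used on the mask on the assumption
+that an untyped literal would not satisfy the rule that both arguments have the same signedness.
+
+```xsd
+<!-- Hypothetical packed byte: high nibble is a version, low nibble is a kind -->
+<xs:element name="packed" type="xs:unsignedByte"
+  dfdl:representation="binary"
+  dfdl:lengthKind="explicit" dfdl:lengthUnits="bits" dfdl:length="8" />
+
+<!-- Unsigned value, so zeros are shifted in from the left -->
+<xs:element name="version" type="xs:unsignedByte"
+  dfdl:inputValueCalc="{ dfdlx:rightShift(../packed, 4) }" />
+
+<!-- Mask with binary 00001111 to keep the low four bits -->
+<xs:element name="kind" type="xs:unsignedByte"
+  dfdl:inputValueCalc="{ dfdlx:bitAnd(../packed, xs:unsignedByte(15)) }" />
+```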
 
-# Extended Behaviors
+## `dfdlx:doubleFromRawLong(longArg)` and `dfdlx:doubleToRawLong(doubleArg)`
+
+IEEE binary float and double values that are not NaN will parse to base 10 
text and unparse back
+to the same exact IEEE binary bits. 
+However, the same cannot be said for NaN (not a number) values, of which there 
are many bit 
+patterns. 
+To preserve float and double NaN values bit for bit you can use these 
functions to compute
+`xs:long` values that enable the DFDL Infoset to preserve the bits of a float 
or double value 
+even if it is a NaN.
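+
+One way to apply this, sketched under the assumption of hypothetical element names `raw` and
+`value`, is to carry the 8 bytes in the infoset as an `xs:long` and to compute the `xs:double`
+only where the schema needs the numeric value:
+
+```xsd
+<!-- The raw bits are what the infoset carries, so any NaN payload survives -->
+<xs:element name="raw" type="xs:long"
+  dfdl:representation="binary"
+  dfdl:lengthKind="explicit" dfdl:lengthUnits="bytes" dfdl:length="8" />
+
+<!-- Computed view of the same bits as a double; not represented in the data -->
+<xs:element name="value" type="xs:double"
+  dfdl:inputValueCalc="{ dfdlx:doubleFromRawLong(../raw) }" />
+```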
+
+# Properties
+
+## `dfdlx:alignmentKind`
+
+Valid values for this property are `manual` or `automatic` with `automatic` 
being the default 
+behavior.
+When specified, the `manual` value turns off all automatic alignment based on 
the 
+`dfdl:alignment` and `dfdl:alignmentUnits` properties.
+The schema author must use `dfdl:leadingSkip`, `dfdl:trailingSkip`, or just 
ensure all the 
+elements/terms are aligned based on their length.
+
+This property is sometimes needed to facilitate creation of schemas where 
interactions occur 
+between computed lengths (that is, stored length fields) and 
+alignment regions that are automatically being inserted. 
+It can be easier to do all alignment manually than to debug these 
interactions. 
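+
+As an illustration only, the sketch below turns off automatic alignment and positions a field
+with an explicit skip instead. The element name `count`, the 3-byte skip, and the placement of
+`dfdlx:alignmentKind` directly on the element are assumptions.
+
+```xsd
+<!-- With alignmentKind 'manual' no alignment fill is inserted automatically, -->
+<!-- so the 3 bytes preceding this field are skipped explicitly. -->
+<xs:element name="count" type="xs:unsignedInt"
+  dfdlx:alignmentKind="manual"
+  dfdl:representation="binary"
+  dfdl:lengthKind="explicit" dfdl:lengthUnits="bytes" dfdl:length="4"
+  dfdl:alignmentUnits="bytes"
+  dfdl:leadingSkip="3" />
+```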
+
+## `dfdlx:parseUnparsePolicy`
+
+A property applied to simple and complex elements, which specifies whether the element supports only parsing, only unparsing, or both parsing and unparsing. Valid values for this property are ``parseOnly``, ``unparseOnly``, or ``both``. This allows one to leave off properties that are required for only parse or only unparse, such as ``dfdl:outputValueCalc`` or ``dfdl:outputNewLine``, so that one may have a valid schema if only a subset of functionality is needed.
+
+All elements must have a compatible parseUnparsePolicy with the compilation 
parseUnparsePolicy (which is defined by the root element daf:parseUnparsePolicy 
and/or the Daffodil parseUnparsePolicy tunable) or it is a Schema Definition 
Error. An element is defined to have a compatible parseUnparsePolicy if it has 
the same value as the compilation parseUnparsePolicy or if it has the value 
``both``.
+
+For compatibility, if this property is not defined, it is assumed to be 
``both``.
+
+## `dfdlx:layer`
+
+_Layers_ provide algorithmic capabilities for decoding/encoding data or 
computing 
+   checksums. Some are built-in to Daffodil. New layers can be created in 
Java/Scala and 
+   plugged-in to Daffodil dynamically. 
+There is [separate Layer documentation](/layers).
+
+## `dfdlx:direction`
+
+This property can appear only on DFDL `defineVariable` statement annotations.
+This property has possible values `both` (the default), `parseOnly`, or 
`unparseOnly`. 
+It declares 
+whether the variable is to be available for only parsing, only unparsing, or 
both. 
+Since this is a newly introduced extension property and existing schemas won't 
contain a definition 
+for it, it has a default value of `both`. 
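+
+A minimal sketch of such a declaration follows; the variable name `payloadKind` and its
+type are hypothetical.
+
+```xsd
+<xs:annotation>
+  <xs:appinfo source="http://www.ogf.org/dfdl/">
+    <!-- Available to parser expressions only -->
+    <dfdl:defineVariable name="payloadKind" type="xs:string"
+      dfdlx:direction="parseOnly" />
+  </xs:appinfo>
+</xs:annotation>
+```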
+
+This property can conflict with the `dfdlx:parseUnparsePolicy` property which 
takes the same 
+values (`both`, `parseOnly`, and `unparseOnly`).
+If `dfdlx:parseUnparsePolicy='parseOnly'` then it is a Schema Definition Error 
if 
+variables in the DFDL schema have `dfdlx:direction='unparseOnly'`. 
+Similarly if `dfdlx:parseUnparsePolicy='unparseOnly'` then it is a Schema 
Definition Error if
+variables in the DFDL schema have `dfdlx:direction='parseOnly'`. 
+
+It is a Schema Definition Error if a variable defined with direction 
`parseOnly` is accessed 
+from an expression used by the unparser. 
+Symmetrically, it is a Schema Definition Error if a variable defined with 
direction
+`unparseOnly` is accessed from an expression used by the parser.
+This error is detected at DFDL schema compilation time, not runtime. 
+
+The properties listed below take expressions for their values and are generally evaluated at
+both parse and unparse time.
+Hence, unless the whole schema is constrained by `dfdlx:parseUnparsePolicy`, any expressions for
+these properties[^moreProps] cannot reference DFDL variables with `dfdlx:direction` of
+`parseOnly` or `unparseOnly`.
+
+- `byteOrder`
+- `encoding`
+- `initiator`
+- `terminator`
+- `separator`
+- `escapeCharacter`
+- `escapeEscapeCharacter`
+- `length`
+- `occursCount`
+- `textStandardDecimalSeparator`
+- `textStandardGroupingSeparator`
+- `textStandardExponentRep`
+- `binaryFloatRep`
+- `textBooleanTrueRep`
+- `textBooleanFalseRep`
+- `calendarLanguage`
+- `dfdl:setVariable`, a `dfdl:newVariableInstance` default value expression, or a
+  `dfdl:defineVariable` default value expression when
+  the variable being set/defaulted is itself referenced from another expression and the variable
+  being set/defaulted has `dfdlx:direction` of `both` (the default)
+
+<!-- footnotes must be all one big long line --> 
+[^moreProps]: New properties added as part of errata corrections to the DFDL 
v1.0 standard which take expressions for their values will need to be added to 
this list or those for parser-specific or unparser-specific properties. 
+  
+Parser-specific expressions include:
+
+- `dfdl:inputValueCalc`
+- `dfdl:length` (when `dfdl:lengthKind='explicit'`)
+- `dfdl:occursCount` (when `dfdl:occursCountKind='expression'`)
+- `dfdl:choiceDispatchKey`
+- the `message` and `test` attributes of the `dfdl:assert` and `dfdl:discriminator` statement annotations
+- `dfdl:setVariable`, a `dfdl:newVariableInstance` default value expression, or a
+  `dfdl:defineVariable` default value expression when
+  the variable being set/defaulted is itself referenced from another expression being
+  accessed at parser creation time, and the variable being set/defaulted has `dfdlx:direction`
+  of `parseOnly`
+
+Unparser-specific expressions include:
+
+- `dfdl:outputValueCalc`
+- `dfdl:length` (when `dfdl:lengthKind='explicit'`)
+- `dfdl:outputNewLine`
+- `dfdl:setVariable`, a `dfdl:newVariableInstance` default value expression, or a
+  `dfdl:defineVariable` default value expression when
+  the variable being set/defaulted is itself referenced from another expression being
+  accessed at unparser creation time, and the variable being set/defaulted has `dfdlx:direction`
+  of `unparseOnly`
+
+
+## Enumerations: `dfdlx:repType`, `dfdlx:repValues`
+
+These properties work together to allow DFDL schemas to define _enumerations_;
+that is, symbolic representations for integer constants. 
+When parsing, Daffodil will convert these integers into the corresponding 
string values. 
+When unparsing, Daffodil will convert strings into the corresponding integers. 
+
+An element of type (or derived from) `xs:string` can be defined using XSD 
`enumeration` facets 
+which constrain the valid values of this string. 
+These enumeration values are effectively symbolic constants. 
+The `dfdlx:repType` and `dfdlx:repValues` properties are then used to define 
the correspondence of 
+the symbolic strings to the corresponding integer values.
+
+### `dfdlx:repType`
+
+The value of this property is an XSD QName of a simple type definition that must be derived
+from `xs:int` or `xs:unsignedInt`.
+A simple type definition for a string can be annotated with `dfdlx:repType` 
+in order to declare that the representation of the string is not as text 
characters but is a 
+numeric integer value. 
+The type referenced from `dfdlx:repType` is usually a fixed length binary 
integer, but can be any
+DFDL type derived from `xs:int` or `xs:unsignedInt`, with any DFDL 
representation properties. 
+
+The mapping between the representation integer and the symbolic constants is 
specified using the 
+`dfdlx:repValues` property. 
+
+### `dfdlx:repValues`
+
+The value of this property is one or more integer values within 
+the numeric range defined for the type referenced by `dfdlx:repType`. When 
more than one value 
+is specified, they are in a whitespace separated list. 
+
+This property is placed on the `xs:enumeration` facets of a symbolic string 
constant having a 
+`dfdlx:repType`. 
+At parse time, if the value of the `dfdlx:repType` integer is found within the `dfdlx:repValues`
+list, then the infoset value for the symbolic string gets the corresponding enumeration facet value.
+It is a parse error if the `dfdlx:repType` integer is not found in any of the 
`dfdlx:repValues` 
+lists of the `xs:enumeration` facets.
+At unparse time, the symbolic constant is mapped to the first integer in the 
`dfdlx:repValues` list. 
+It is an unparse error if the symbolic string value is not found among the 
`xs:enumeration` 
+facet values of the symbolic string type.
+
+### Examples of Enumerations in Daffodil DFDL
+
+A simple example of a basic enum is:
+
+```xsd
+  <simpleType name="rep3Bit" dfdl:lengthUnits="bits" dfdl:length="3" 
dfdl:lengthKind="explicit">
+    <restriction base="xs:unsignedInt"/>
+  </simpleType>
+    
+  <simpleType name="precedenceEnum" dfdlx:repType="pre:rep3Bit">
+    <restriction base="xs:string">
+      <enumeration value="Reserved_0" dfdlx:repValues="0"/>
+      <enumeration value="Reserved_1" dfdlx:repValues="1"/>
+      <enumeration value="Emergency" dfdlx:repValues="2"/>
+      <enumeration value="Reserved_3" dfdlx:repValues="3"/>
+      <enumeration value="Flash" dfdlx:repValues="4"/>
+      <enumeration value="Immediate" dfdlx:repValues="5"/>
+      <enumeration value="Priority" dfdlx:repValues="6"/>
+      <enumeration value="Routine" dfdlx:repValues="7"/>
+    </restriction>
+  </simpleType>
+  ```
+
+Above we see the `dfdlx:repType` is `rep3Bit` which is a 3 bit 
`xs:unsignedInt`. This can
+represent the values 0 to 7 which one can see are the `dfdlx:repValues` of the 
`xs:enumeration`
+facets for this enumeration string type which is named `precedenceEnum`.
+
+In the above you can also see that the symbolic strings are in one-to-one 
correspondence with 
+every possible value of the 3-bit representation integer. 
+This one-to-one correspondence assures that data that is first parsed and then 
unparsed will 
+recreate the exact numeric bits used.
+
+However, in data security applications the following may be preferred:
+```xsd
+ <simpleType name="precedenceEnum" dfdlx:repType="pre:rep3Bit">
+    <restriction base="xs:string">
+      <enumeration value="Reserved" dfdlx:repValues="0 1 3"/>
+      <enumeration value="Emergency" dfdlx:repValues="2"/>
+      <enumeration value="Flash" dfdlx:repValues="4"/>
+      <enumeration value="Immediate" dfdlx:repValues="5"/>
+      <enumeration value="Priority" dfdlx:repValues="6"/>
+      <enumeration value="Routine" dfdlx:repValues="7"/>
+    </restriction>
+  </simpleType>
+```
+
+In the above we see that three numeric values, 0, 1, and 3 are the 
`dfdlx:repValues` mapped to 
+the symbolic string `Reserved`. 
+This technique has the advantage of blocking covert signals being transmitted 
by use of the 
+different reserved values since when unparsed, the constant string `Reserved` 
will always be 
+_canonicalized_ to integer 0. 
+Putting data into canonical form when unparsing generally improves data 
security.
+
+# Extended Behaviors for DFDL Types
 
 ## Type ``xs:hexBinary``
 
 Daffodil allows `dfdlx:lengthUnits='bits'` for this simple type. 
+
+----
diff --git a/site/layers.md b/site/layers.md
index db79eb2..373f70d 100644
--- a/site/layers.md
+++ b/site/layers.md
@@ -3,8 +3,7 @@ description: Pluggable Extensions to Enable Algorithmic 
Transformations
   in DFDL
 group: 'nav-right'
 layout: page
-title: 'Layers - Algorithmic
-  Extensions for DFDL'
+title: 'Layers - Algorithmic Extensions for DFDL'
 ---
 <!--
 {% comment %}
@@ -24,12 +23,13 @@ See the License for the specific language governing 
permissions and
 limitations under the License.
 {% endcomment %}
 -->
-
 ## Table of Contents
 {:.no_toc}
+<!-- The {: .no_toc } excludes the above heading from the ToC --> 
 
-1. use ordered table of contents
+1. yes, this is the standard Jekyll way to do a ToC (this line gets removed)
 {:toc}
+<!-- note the above line {:toc} cannot have whitespace at the start --> 
 
 # Introduction
 
@@ -58,7 +58,7 @@ There is no limit to this depth.
 In the section on [Using Layers](#UsingLayers) below we will look at an 
example that uses 
 multiple layers together. 
 
-# Built-in Layers
+## Built-in Layers
 
 Daffodil includes several built-in layers:
 - [base64_MIME](#base64-mime-layer)
@@ -78,20 +78,20 @@ Each of the built-in layers will be
 [documented separately below](#daffodil-built-in-layer-documentation) with 
examples of their 
 usage.
 
-# Custom Plug-In Layers
+## Custom Plug-In Layers
 
 Additional layers can be written in Java or Scala and deployed as _plug-ins_ 
for Daffodil.
 These are generally packaged as DFDL _layer schemas_, a kind of _component 
schema_,
 that provide the layer packaged for import by other DFDL _assembly_ schemas 
that use the 
 layer in the data format they describe. 
 
-# Layer Kinds: Transforming Layers and Checksum Layers
+## Layer Kinds: Transforming Layers and Checksum Layers
 
 There are two different kinds of layers, though they share many 
characteristics. They are 
 _transforming_ layers, and _checksum_ layers. Both run small algorithms over 
part (or all) of 
 the data stream. The difference is the purpose of the algorithm and its 
output. 
 
-## Transforming Layers
+### Transforming Layers
 
 These layers decode data (when parsing), and encode data (when unparsing). 
 The simplest example of a transforming layer is the `base64_MIME` layer which 
@@ -106,7 +106,7 @@ Custom transforming layers are created by deriving an 
implementation from the Da
 
[`Layer`](/docs/latest/javadoc/org/apache/daffodil/runtime1/layers/api/Layer.html)
 class 
 which is introduced in a later section. 
 
-## Checksum Layers
+### Checksum Layers
 
 Checksum layers are a simplified kind of layer which do not decode or encode data, they simply 
 pass through the data unmodified, but while doing so they compute a checksum, hash, or Cyclic
@@ -129,6 +129,8 @@ Custom checksum layers are created by deriving an implementation class from the
 html) 
 class, which is introduced in a later section.
 
+----
+
 # Using Layers
 
 To use a layer you must know
@@ -201,9 +203,7 @@ Layers may specify restrictions on the minimum and maximum allowed values of the
 and passing an out-of-range value for the variable is a processing error.
 
 
-# Examples
-
-## Line Folding
+# Example: Line Folding
 
 Consider the line folding layer, specifically the `lineFolded_IMF` layer, 
 which is built-in to Daffodil.
@@ -220,7 +220,7 @@ Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
 ```
 This data has been _line folded_ at roughly 72 characters by inserting a CRLF
 before an existing space in the data. 
-Each line ends with a CRLF (\r\n) and the second through fourth lines begin 
+Each line ends with a CRLF (`\r\n`) and the second through fourth lines begin 
 with a space as a way of indicating that they are extension lines. 
 This data is supposed to be reassembled to form a long single-line string by removing
 all CRLF pairs.
@@ -265,7 +265,7 @@ Other examples will show how the layer length can be limited to a sub-region of
 
 More detailed documentation for the [Line Folded Layers](#line-folded-layers) is below.
 
-## Base64, GZip, and BoundaryMark Layers used Together
+# Example: Base64, GZip, and BoundaryMark Layers used Together
 
 In this example, the data consists of a preliminary string, a section of CSV-like data, and a 
 final string element.
@@ -433,6 +433,8 @@ This group definition is the last thing in the schema:
 ```
 The above schema works both to parse, but also to unparse this data. 
 
+----
+
 # Using Custom Plug-In Layers
 
 A custom plug-in layer is used in the same manner as the built-in Daffodil layers with just a few
@@ -463,6 +465,9 @@ base class.
 Further details on how to define custom plug-in layers is in the Javadoc for the 
 [Layer API](/docs/latest/javadoc/org/apache/daffodil/runtime1/layers/api/package-summary.html)
 
+----
+----
+
 # Daffodil Built-In Layer Documentation
 
 Each of the layers built-in to the Daffodil implementation are documented in a section below 
@@ -476,6 +481,8 @@ The built-in layers are:
 - [lineFolded_IMF](#line-folded-layers)
 - [lineFolded_iCalendar](#line-folded-layers)
 
+----
+
 ## Base64 MIME Layer
 
 - Name: base64_MIME
@@ -491,11 +498,13 @@ This uses the standard `java.util.Base64` classes, specifically the MIME encodin
 
 This is specified by [RFC 2045](https://www.ietf.org/rfc/rfc2045.txt).
 The encoded output must be represented in lines of no more than 76 characters 
-each and uses a carriage return '\r' followed immediately by a linefeed '\n' as the line separator. 
+each and uses a carriage return `\r` followed immediately by a linefeed `\n` as the line separator. 
 No line separator is added to the end of the encoded output. 
 All line separators or other characters not found in the base64 alphabet table are ignored in
 decoding operation.
 
+----
+
 ## BoundaryMark Layer
 
 - Name: boundaryMark
@@ -537,6 +546,9 @@ of any child element enclosed within the layer, or even the lengths of other lay
 within the scope of this boundary mark layer are not considered and do not disrupt the search 
 for the boundary mark string.
 
+
+----
+
 ## Byte-Swapping Layers
 
 - Layer Names:
@@ -570,6 +582,8 @@ order 2 1 4 3 6 5 8 7 10 9.
 If `requireLengthInWholeWords` is bound to "yes", then if the length is not a multiple of the 
 word size a processing error occurs. 
 
+----
+
 ## FixedLength Layer
 
 - Name: fixedLength
@@ -588,6 +602,8 @@ word size a processing error occurs.
 Suitable only for small sections of data, not large data streams or large files.
 The entire fixed length region of the data will be pulled into a byte buffer in memory.
 
+----
+
 ## GZIP Layer
 
 - Name: gzip
@@ -610,6 +626,8 @@ depending on the Java version used.
 To avoid inconsistent behavior of test failures that expect a certain byte value this layer
 always writes a consistent header (header byte 9 of 255) regardless of the Java version.
 
+----
+
 ## Line Folded Layers
 
 - Layer Names:
@@ -624,7 +642,6 @@ always writes a consistent header (header byte 9 of 255) regardless of the Java
       <xs:import namespace="urn:org.apache.daffodil.layers.lineFolded"
          
schemaLocation="/org/apache/daffodil/layers/xsd/lineFoldedLayer.dfdl.xsd"/>
 ```
-
 ### General Usage
 
 There is a limitation on the compatibility of line folding of data
@@ -633,7 +650,7 @@ For example, line folding can interact badly with surrounding elements of `dfdl:
 'pattern'` if the pattern is, for example `".*?\\r\\n(?!(?:\\t|\\ ))"` which is anything up to
 and including a CRLF not followed by a space or tab. 
 The problem is that line folding
-converts isolated \n or \r into \r\n, and if this just happens to be followed by a
+converts isolated `\n` or `\r` into `\r\n`, and if this just happens to be followed by a
 non space/tab character this will have inserted an end-of-data in the middle of the
 data.
 
diff --git a/site/unsupported.md b/site/unsupported.md
index f85faf3..4d2db35 100644
--- a/site/unsupported.md
+++ b/site/unsupported.md
@@ -51,7 +51,7 @@ that there has been no intention to support as of this release.
 # XML Schema Features
 
 * fixed {% jira 117 %}
-* default {% jira 115 %} {% jira 1277 %}
+* default {% jira 115 %}
 
 # Properties and Property Enumerations
