mbeckerle commented on code in PR #195:
URL: https://github.com/apache/daffodil-site/pull/195#discussion_r2482347530
##########
site/dfdl-extensions.md:
##########
@@ -87,46 +107,346 @@ found after fields `a` and `b`:
<xs:element name="tag" type="xs:int" dfdl:length="8" />
```
-Bitwise Functions
+## Bitwise Functions: `bitAnd`, `bitOr`, `bitXor`, `bitNot`, `leftShift`,
`rightShift`
+
+These functions are defined on types `long`, `int`, `short`, `byte`,
`unsignedLong`,
+`unsignedInt`, `unsignedShort`, and `unsignedByte`.
+
+### `dfdlx:bitAnd(arg1, arg2)`
+
+This computes the bitwise AND of two integers.
+
+- Both arguments must be signed, or both must be unsigned.
+- If the two arguments are not the same type the smaller one is converted into
the type of the
+larger one.
+- If the smaller argument is signed, this conversion does sign-extension.
+- The result type is that of the larger argument.
+
+### `dfdlx:bitOr(arg1, arg2)`
+
+This computes the bitwise OR of two integers.
+
+- Both arguments must be signed, or both must be unsigned.
+- If the two arguments are not the same type the smaller one is converted into
the type of the
+larger one.
+- If the smaller argument is signed, this conversion does sign-extension.
+- The result type is that of the larger argument.
+
+### `dfdlx:bitXor(arg1, arg2)`
+
+This computes the bitwise Exclusive OR of two integers.
+
+- Both arguments must be signed, or both must be unsigned.
+- If the two arguments are not the same type the smaller one is converted into
the type of the
+larger one.
+- If the smaller argument is signed, this conversion does sign-extension.
+- The result type is that of the larger argument.
- : TBD, but the complete list (all ``dfdlx``) is `BitAnd`, `BitNot`,
`BitOr`, `BitXor`, `LeftShift`,
- `RightShift`
+### `dfdlx:bitNot(arg)`
-``dfdlx:doubleFromRawLong`` and ``dfdlx:doubleToRawLong``
+This computes the bitwise NOT of an integer. Every bit is inverted. The result
type is the same
+as the argument type.
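
The conversion rules for `bitAnd`, `bitOr`, `bitXor`, and `bitNot` can be modeled outside of DFDL. The following Python sketch is only an illustration of the rules as stated above, not Daffodil's implementation; the helper names (`WIDTH`, `to_bits`, `from_bits`, `bit_op`, `bit_not`) are hypothetical, and it assumes two's-complement representation for the signed types.

```python
import operator

# Bit widths of the integer types the bitwise functions are defined on.
WIDTH = {
    "byte": 8, "short": 16, "int": 32, "long": 64,
    "unsignedByte": 8, "unsignedShort": 16,
    "unsignedInt": 32, "unsignedLong": 64,
}

def is_signed(type_name):
    return not type_name.startswith("unsigned")

def to_bits(value, width):
    # Masking a negative Python int yields its two's-complement pattern,
    # which amounts to sign-extension to `width` bits.
    return value & ((1 << width) - 1)

def from_bits(bits, type_name):
    width = WIDTH[type_name]
    if is_signed(type_name) and bits >> (width - 1):
        return bits - (1 << width)  # top bit set: negative value
    return bits

def bit_op(op, a, a_type, b, b_type):
    if is_signed(a_type) != is_signed(b_type):
        raise ValueError("arguments must be both signed or both unsigned")
    # The result takes the type of the wider argument.
    result_type = a_type if WIDTH[a_type] >= WIDTH[b_type] else b_type
    width = WIDTH[result_type]
    bits = op(to_bits(a, width), to_bits(b, width))
    return from_bits(bits, result_type), result_type

def bit_not(value, type_name):
    width = WIDTH[type_name]
    inverted = ~to_bits(value, width) & ((1 << width) - 1)
    return from_bits(inverted, type_name)
```

In this model, `bit_op(operator.and_, -1, "byte", 0x0F0F, "short")` returns `(3855, "short")`: the byte `-1` is sign-extended to `0xFFFF` before the AND is applied.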
- : Converting binary floating point numbers to/from base 10 text can result
in lost information.
-The base 10 representation, converted back to binary representation, may not
be bit-for-bit
- identical. These functions can be used to carry 8-byte double precision
IEEE floating point
- numbers as type `xs:long` so that no information is lost. The DFDL schema
can still obtain
- and operate on the floating point value by converting these `xs:long`
values into type
- `xs:double`, and back if necessary for unparsing a new value.
+### `dfdlx:leftShift(value, shiftCount)`
-### Properties
+This is the _logical_ shift left, meaning that bits are shifted from
less-significant positions
+to more-significant positions.
-``dfdlx:parseUnparsePolicy``
+- The left-most bits shifted out are discarded.
+- Zeros are shifted in for the right-most bits.
+- The result type is the same as the `value` argument type.
+- It is a processing error if the `shiftCount` argument is < 0.
+- It is a processing error if the `shiftCount` argument is greater than the
number of
+ bits in the type of the value argument.
- : A property applied to simple and complex elements, which specifies
whether the element supports only parsing, only unparsing, or both parsing and
unparse. Valid values for this property are ``parse``, ``unparse``, or
``both``. This allows one to leave off properties that are required for only
parse or only unparse, such as ``dfdl:outputValueCalc`` or
``dfdl:outputNewLine``, so that one may have a valid schema if only a subset of
functionality is needed.
+### `dfdlx:rightShift(value, shiftCount)`
- All elements must have a compatible parseUnparsePolicy with the
compilation parseUnparsePolicy (which is defined by the root element
daf:parseUnparsePolicy and/or the Daffodil parseUnparsePolicy tunable) or it is
a Schema Definition Error. An element is defined to have a compatible
parseUnparsePolicy if it has the same value as the compilation
parseUnparsePolicy or if it has the value ``both``.
+This is the _arithmetic_ shift right, meaning bits move from most-significant
to
+less-significant positions.
+If _logical_ (zero-filling) shift right is needed, you must use unsigned types.
- For compatibility, if this property is not defined, it is assumed to be
``both``.
+- The `value` argument is shifted by the `shiftCount`.
+- The right-most bits shifted out are discarded.
+- If the `value` is signed, then the sign bit is shifted in for the left-most
bits.
+- If the `value` is unsigned, then zeros are shifted in for the left-most
bits.
+- The result type is the same as the `value` argument type.
+- It is a processing error if the `shiftCount` argument is < 0.
+- It is a processing error if the `shiftCount` argument is greater than the
number of
+ bits in the type of the value argument.
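
The shift rules above can be sketched in Python as follows. This is a hedged model of the documented behavior, not Daffodil's code; the names `WIDTH`, `check_shift_count`, `left_shift`, and `right_shift` are hypothetical, and two's-complement representation is assumed for signed types.

```python
WIDTH = {
    "byte": 8, "short": 16, "int": 32, "long": 64,
    "unsignedByte": 8, "unsignedShort": 16,
    "unsignedInt": 32, "unsignedLong": 64,
}

def check_shift_count(shift_count, width):
    # Mirrors the two processing-error conditions listed above.
    if shift_count < 0 or shift_count > width:
        raise ValueError("processing error: shiftCount out of range")

def left_shift(value, type_name, shift_count):
    width = WIDTH[type_name]
    check_shift_count(shift_count, width)
    mask = (1 << width) - 1
    # Zeros enter on the right; bits shifted past the width are discarded.
    bits = ((value & mask) << shift_count) & mask
    if not type_name.startswith("unsigned") and bits >> (width - 1):
        return bits - (1 << width)
    return bits

def right_shift(value, type_name, shift_count):
    check_shift_count(shift_count, WIDTH[type_name])
    # Python's >> on ints is arithmetic: negative values fill with the
    # sign bit, non-negative (including all unsigned) values with zeros.
    return value >> shift_count
```

For example, `right_shift(-8, "int", 1)` gives `-4` (sign bit shifted in), while `right_shift(240, "unsignedByte", 4)` gives `15` (zeros shifted in).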
-``dfdlx:layer``
+## `dfdlx:doubleFromRawLong(longArg)` and `dfdlx:doubleToRawLong(doubleArg)`
- : [Layers](/layers) provide algorithmic capabilities for decoding/encoding
data or computing
+IEEE binary float and double values that are not NaN will parse to base 10
text and unparse back
+to the same exact IEEE binary bits.
+However, the same cannot be said for NaN (not a number) values, of which there
are many bit
+patterns.
+To preserve float and double NaN values bit for bit you can use these
functions to compute
+`xs:long` values that enable the DFDL Infoset to preserve the bits of a float
or double value
+even if it is a NaN.
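
To illustrate why carrying the raw bits as a long preserves a NaN, here is a small Python sketch of the same reinterpretation. The function names `double_to_raw_long` and `double_from_raw_long` are hypothetical stand-ins for the `dfdlx` functions, and the sketch assumes a platform whose native `double` is IEEE 754 binary64.

```python
import math
import struct

def double_to_raw_long(d):
    # Reinterpret the 8 bytes of an IEEE double as a signed 64-bit integer.
    return struct.unpack(">q", struct.pack(">d", d))[0]

def double_from_raw_long(n):
    # Reinterpret a signed 64-bit integer as the double with the same bits.
    return struct.unpack(">d", struct.pack(">q", n))[0]

# A quiet NaN carrying a non-zero payload in its low bits.
nan_bits = 0x7FF8000000000001
value = double_from_raw_long(nan_bits)
```

Here `value` is a NaN, and `double_to_raw_long(value)` is expected to give back `nan_bits` unchanged, whereas a trip through base 10 text would reduce it to the string `NaN` and lose the payload.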
+
+
+
+# Properties
+
+## `dfdlx:alignmentKind`
+
+Valid values for this property are `manual` or `automatic` with `automatic`
being the default
+behavior.
+When specified, the `manual` value turns off all automatic alignment based on
the
+`dfdl:alignment` and `dfdl:alignmentUnits` properties.
+The schema author must use `dfdl:leadingSkip`, `dfdl:trailingSkip`, or just
ensure all the
+elements/terms are aligned based on their length.
+
+This property is sometimes needed to facilitate creation of schemas where
interactions occur
+between computed lengths (that is, stored length fields) and
+alignment regions that are automatically being inserted.
+It can be easier to do all alignment manually than to debug these
interactions.
+
+## `dfdlx:parseUnparsePolicy`
+
+A property applied to simple and complex elements, which specifies whether the
element supports only parsing, only unparsing, or both parsing and unparsing.
Valid values for this property are ``parse``, ``unparse``, or ``both``. This
allows one to leave off properties that are required for only parse or only
unparse, such as ``dfdl:outputValueCalc`` or ``dfdl:outputNewLine``, so that
one may have a valid schema if only a subset of functionality is needed.
+
+All elements must have a compatible parseUnparsePolicy with the compilation
parseUnparsePolicy (which is defined by the root element daf:parseUnparsePolicy
and/or the Daffodil parseUnparsePolicy tunable) or it is a Schema Definition
Error. An element is defined to have a compatible parseUnparsePolicy if it has
the same value as the compilation parseUnparsePolicy or if it has the value
``both``.
+
+For compatibility, if this property is not defined, it is assumed to be
``both``.
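
The compatibility rule can be stated as a short predicate. This Python sketch is only a restatement of the rule above using the values `parse`, `unparse`, and `both` as given in this section; the function name `is_compatible` is hypothetical.

```python
def is_compatible(element_policy, compilation_policy):
    # An undefined property is treated as "both".
    effective = "both" if element_policy is None else element_policy
    # Compatible when equal to the compilation policy, or when "both".
    return effective == compilation_policy or effective == "both"
```

An element for which this predicate is false would be reported as a Schema Definition Error.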
+
+## `dfdlx:layer`
+
+_Layers_ provide algorithmic capabilities for decoding/encoding data or
computing
checksums. Some are built into Daffodil. New layers can be created in
Java/Scala and
plugged into Daffodil dynamically.
+There is [separate Layer documentation](/layers).
+
+## `dfdlx:direction`
+
+This property can appear only on DFDL `defineVariable` statement annotations.
+This property has possible values `both` (the default), `parseOnly`, or
`unparseOnly`.
+It declares
+whether the variable is to be available for only parsing, only unparsing, or
both.
+Since this is a newly introduced extension property and existing schemas won't
contain a definition
+for it, it has a default value of `both`.
+
+This property can conflict with the `dfdlx:parseUnparsePolicy` property which
takes the same
+values (`both`, `parseOnly`, and `unparseOnly`).
+If `dfdlx:parseUnparsePolicy='parseOnly'` then it is a Schema Definition Error
if
+variables in the DFDL schema have `dfdlx:direction='unparseOnly'`.
+Similarly if `dfdlx:parseUnparsePolicy='unparseOnly'` then it is a Schema
Definition Error if
+variables in the DFDL schema have `dfdlx:direction='parseOnly'`.
+
+It is a Schema Definition Error if a variable defined with direction
`parseOnly` is accessed
+from an expression used by the unparser.
+Symmetrically, it is a Schema Definition Error if a variable defined with
direction
+`unparseOnly` is accessed from an expression used by the parser.
+This error is detected at DFDL schema compilation time, not runtime.
+
+The properties listed below take expressions for their values and are generally
evaluated at both parse and
+unparse time.
+Hence, unless the whole schema is constrained by `dfdlx:parseUnparsePolicy`,
any expressions for
+these properties[^moreProps]
+cannot reference DFDL variables with `dfdlx:direction` of `parseOnly` or
`unparseOnly`.
+
+- `byteOrder`
+- `encoding`
+- `initiator`
+- `terminator`
+- `separator`
+- `escapeCharacter`
+- `escapeEscapeCharacter`
+- `length`
+- `occursCount`
+- `textStandardDecimalSeparator`
+- `textStandardGroupingSeparator`
+- `textStandardExponentRep`
+- `binaryFloatRep`
+- `textBooleanTrueRep`
+- `textBooleanFalseRep`
+- `calendarLanguage`
+- `dfdl:setVariable`, a `dfdl:newVariableInstance` default value expression,
or a
+ `dfdl:defineVariable` default value expression when
+ that variable being set/defaulted is itself referenced from another
expression and the variable
+ being set/defaulted has `dfdlx:direction` of `both` (the default)
+
+[^moreProps]: New properties added as part of errata corrections to the DFDL
v1.0 standard which
+take expressions for their values will need to be added to this list or to the
lists of
+parser-specific or unparser-specific properties.
+
+Parser-specific expressions include:
+
+- `dfdl:inputValueCalc`
+- `dfdl:length` (when `dfdl:lengthKind='explicit'`)
+- `dfdl:occursCount` (when `dfdl:occursCountKind='expression'`)
+- `dfdl:choiceDispatchKey`
+- the `message` and `test` attributes of the `dfdl:assert` and
`dfdl:discriminator` statement annotations
+- `dfdl:setVariable`, a `dfdl:newVariableInstance` default value expression,
or a
+ `dfdl:defineVariable` default value expression when
+ that variable being set/defaulted is itself referenced from another
expression being
+ accessed at parser creation time, and the variable being set/defaulted has
`dfdlx:direction`
+ of `parseOnly`
+
+Unparser-specific expressions include:
+
+- `dfdl:outputValueCalc`
+- `dfdl:length` (when `dfdl:lengthKind='explicit'`)
+- `dfdl:outputNewLine`
+- `dfdl:setVariable`, a `dfdl:newVariableInstance` default value expression,
or a
+ `dfdl:defineVariable` default value expression when
+ that variable being set/defaulted is itself referenced from another
expression being
+ accessed at unparser creation time, and the variable being set/defaulted has
`dfdlx:direction`
+ of `unparseOnly`
+
+
+## Enumerations: `dfdlx:repType`, `dfdlx:repValues`, and `dfdlx:repValueRanges`
+
+These properties work together to allow DFDL schemas to define _enumerations_;
+that is, symbolic representations for integer constants.
+When parsing, Daffodil will convert these integers into the corresponding
string values.
+When unparsing, Daffodil will convert strings into the corresponding integers.
+
+An element of type (or derived from) `xs:string` can be defined using XSD
`enumeration` facets
+which constrain the valid values of this string.
+These enumeration values are effectively symbolic constants.
+The `dfdlx:repType` and `dfdlx:repValues` properties are then used to define
the correspondence of
+the symbolic strings to the corresponding integer values.
+
+### `dfdlx:repType`
+
+The value of this property is an XSD QName of a simple type definition that
must be derived
+from `xs:int` or `xs:unsignedInt`.
+A simple type definition for a string can be annotated with `dfdlx:repType`
+in order to declare that the representation of the string is not as text
characters but is a
+numeric integer value.
+The type referenced from `dfdlx:repType` is usually a fixed length binary
integer, but can be any
+DFDL type derived from `xs:int` or `xs:unsignedInt`, with any DFDL
representation properties.
+
+The mapping between the representation integer and the symbolic constants is
specified using the
+`dfdlx:repValues` and/or `dfdlx:repValueRanges` properties.
+
+### `dfdlx:repValues`
+
+The value of this property is one or more integer values within
+the numeric range defined for the type referenced by `dfdlx:repType`. When
more than one value
+is specified, they are in a whitespace separated list.
+
+This property is placed on the `xs:enumeration` facets of a symbolic string
constant having a
+`dfdlx:repType`.
+At parse time, if the value of the `dfdlx:repType` integer is found within the
`dfdlx:repValues`
+list, then the infoset value for the symbolic string gets the corresponding
enumeration facet value.
+It is a parse error if no `xs:enumeration` has a `dfdlx:repValues` or
`dfdlx:repValueRanges`
+(see below) that assigns a symbolic equivalent to the `dfdlx:repType` integer.
+At unparse time, the symbolic constant is mapped to the first integer in the
`dfdlx:repValues` list.
+It is an unparse error if the symbolic string value is not found among the
`xs:enumeration`
+facet values of the symbolic string type.
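
The `dfdlx:repValues` lookup in each direction can be pictured with the following Python sketch. It is an illustration of the rules above only: the `FACETS` table, its color names, and the functions `parse_rep` and `unparse_symbol` are hypothetical, and `dfdlx:repValueRanges` is deliberately left out.

```python
# Hypothetical enumeration: each xs:enumeration facet value paired with
# the integers listed in its dfdlx:repValues property.
FACETS = [
    ("red", [0]),
    ("green", [1, 2]),
    ("blue", [3]),
]

def parse_rep(rep_int):
    # Parsing: find the facet whose repValues list contains the integer.
    for symbol, rep_values in FACETS:
        if rep_int in rep_values:
            return symbol
    raise ValueError("parse error: no facet maps %d" % rep_int)

def unparse_symbol(symbol):
    # Unparsing: use the first integer in the facet's repValues list.
    for facet, rep_values in FACETS:
        if facet == symbol:
            return rep_values[0]
    raise ValueError("unparse error: unknown symbol %r" % symbol)
```

With this table, both `1` and `2` parse to `green`, but `green` always unparses to `1`, so a parse followed by an unparse is not guaranteed to reproduce the original integer.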
+
+
+### `dfdlx:repValueRanges`
+
+The value of this property is a list of integers of even length 2 or greater.
The integers at
+odd positions (starting with position 1) define the inclusive lower bound of a
range of
+integers.
+The integers at even positions (starting with position 2) define the
corresponding inclusive
+upper bound of a range of integers.
+
+This property is placed on the `xs:enumeration` facets of a symbolic string
constant having a
+`dfdlx:repType`.
+
+At parse time, the integer value of the `dfdlx:repType` is used to search the
numeric ranges.
+If it is found in any of the numeric ranges for a specific `xs:enumeration`
facet, then the
+facet's value is used as the corresponding symbolic value.
+It is a parse error if no `xs:enumeration` has a `dfdlx:repValues` (see above)
or
+`dfdlx:repValueRanges` that assigns a symbolic equivalent to the `dfdlx:repType`
integer.
+At unparse time, the symbolic string value's corresponding `xs:enumeration`
facet is found and
+if the `xs:enumeration` contains both `dfdlx:repValues` and
`dfdlx:repValueRanges` then the
+`dfdlx:repValues` is used to determine the corresponding `dfdlx:repType` integer
value to unparse,
+as described above for the `dfdlx:repValues` property.
+If the `xs:enumeration` has no `dfdlx:repValues` property, then the smallest
numeric value in the
+`dfdlx:repValueRanges` list is unparsed for the `dfdlx:repType` integer.
+
+TBD: is this correct? Or is it the lower bound of the first range?
Review Comment:
Fix these TBDs