Packed and Zoned Design details

Mike Beckerle Fri, 20 Oct 2017 11:18:48 -0700

Our initial goal for the packed/zoned number support is to be able to run the 
IBM-created schemas on github for ISO8583 format, and for IBM4690_TLOG format, 
as well as run all the scala-debug tests we have that use the packed/zoned and 
other features that we've not yet implemented.

What we need:

So, ISO8583 uses zoned numbers. IBM4690_TLOG uses "ibm4690Packed" binary
numbers.

Neither uses plain old "packed" decimal numbers, but in the daffodil-test-ibm1
module (tests contributed by IBM) there are tests that use regular old "packed"
decimal. So we should implement that and get those tests to run also, as part
of interoperability demonstration.

In the interest of reducing implementation risk, we should start by
implementing the smallest viable subset of the functionality.

And the JTOPEN library appears to be far from comprehensive, so if we use it,
we can only implement a starter set of functionality. However it is a good
place to start.

For example, we can assume (and check) that the dfdl:byteOrder is always
bigEndian, which is all that the JTOPEN library supports. Later we can
implement and test littleEndian variants.

Some specifics:

For dfdl:binaryNumberRep="ibm4690Packed"

JTOPEN doesn't support this. In this variant of packed and in the IBM4690_TLOG
format, the bytes for the number are isolated by use of delimiters (delimited
binary - meaning the delimiter is known to be something that cannot appear in
the bytes of a packed number!)

We do support binary delimited for hexBinary data today, so the isolation of
the bytes can be based on that code.

We can write our own parser/unparser for ibm4690Packed, or massage the bytes
into packed form, and then call the JTOPEN routine for packed.
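
As a rough illustration of the "massage the bytes into packed form" option,
here is a minimal Python sketch. It is not Daffodil or JTOPEN code. The
function name is hypothetical, and the layout it assumes (leading 0xF nibbles
are padding, a 0xD nibble ahead of the digits marks a negative value, no sign
nibble at all for positive values) is my reading of ibm4690Packed and should
be checked against the DFDL specification before anyone relies on it.

```python
def ibm4690_to_packed(data: bytes) -> bytes:
    """Rewrite ibm4690Packed bytes as trailing-sign packed decimal bytes."""
    # Split every byte into its high and low nibble.
    nibbles = [n for b in data for n in (b >> 4, b & 0x0F)]

    # Assumption: leading 0xF nibbles are padding.
    i = 0
    while i < len(nibbles) and nibbles[i] == 0xF:
        i += 1

    # Assumption: a 0xD nibble ahead of the digits marks a negative value.
    negative = i < len(nibbles) and nibbles[i] == 0xD
    if negative:
        i += 1

    digits = nibbles[i:]
    if not digits or any(d > 9 for d in digits):
        raise ValueError("not a valid ibm4690Packed number")

    # Packed form: the digits followed by a sign nibble (C = +, D = -).
    out = digits + [0xD if negative else 0xC]
    if len(out) % 2:
        out.insert(0, 0)  # leading zero digit so the result fills whole bytes
    return bytes((out[j] << 4) | out[j + 1] for j in range(0, len(out), 2))
```

The result could then be handed to whatever routine handles plain packed
decimal, at the cost of one extra copy of the bytes.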

DFDL requires 4-bit alignment for all the packed number types. So in the
alignmentInBits lazy val in the Daffodil schema compiler, a check needs to be
made for the binaryNumberRep packed cases to ensure this 4-bit alignment.

For dfdl:binaryNumberRep="packed"

There is no DFDL property for specifying leading sign for packed. Sign is
always the final/last nibble of the last byte.

I used to think that, like zoned, for packed you could specify if you wanted
the sign leading or trailing, but some web searching suggests only trailing
sign nibbles for the "packed" representation (what Cobol calls Computational-3
or Comp-3 type).

But note that ibm4690Packed is a variant of packed with leading sign.

Initially we can require the dfdl:binaryPackedSignCodes to be specified, but
only accept "C D F C" as the 4 nibbles - assuming this is what the JTOPEN
library implements.
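
To make the trailing-sign layout concrete, here is a minimal Python sketch of
decoding a "packed" number while accepting only the sign nibbles from
"C D F C". It is illustrative only, not the JTOPEN routine. The function name
is hypothetical, and I am assuming the four codes are, in order, positive,
negative, unsigned and zero.

```python
def parse_packed(data: bytes) -> int:
    """Decode big-endian packed decimal with a trailing sign nibble."""
    if not data:
        raise ValueError("empty packed number")

    # Split every byte into its high and low nibble.
    nibbles = [n for b in data for n in (b >> 4, b & 0x0F)]

    # The sign is always the last nibble of the last byte.
    *digits, sign = nibbles
    if any(d > 9 for d in digits):
        raise ValueError("non-decimal digit nibble")

    # Assumed meaning of "C D F C": C positive, D negative, F unsigned,
    # C zero. Any other sign nibble is rejected.
    if sign not in (0xC, 0xD, 0xF):
        raise ValueError("unsupported sign nibble: %X" % sign)

    value = 0
    for d in digits:
        value = value * 10 + d
    return -value if sign == 0xD else value
```

For instance, the two bytes 0x12 0x3D carry the digits 1, 2, 3 and the sign
nibble D, so they decode to -123.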

For dfdl:binaryNumberCheckPolicy, "strict" is specified by IBM4690_TLOG, but
"lax" is specified by the ISO8583 schema. So both must be supported, but
initially we can implement strict, and add lax later.

For dfdl:textNumberRep="zoned"

JTOPEN only supports trailing overpunched sign.

So if the dfdl:textNumberPattern shows a sign "+" location, it must be after
the final digit. The ISO8583 schema doesn't do this. It shows a leading +
sign. However, all the data is actually unsigned, so there is no overpunched
minus sign, and whether the "+" is first or last doesn't matter.

Here's a link to the variations in Cobol for specifying numbers with "Usage
Display" which means "text numbers"

https://supportline.microfocus.com/Documentation/books/rd60/lhpdf40m.htm

I include this link only by way of showing that Cobol data can have many more
variants than the JTOPEN library supports. Also of note is that Cobol's default
behavior for Usage Display is zoned trailing sign. Cobol code must add the
clause "SIGN TRAILING SEPARATE" or "SIGN LEADING SEPARATE" to get a
textNumberRep='standard' number.

Since JTOPEN will not support our functional needs, we will have to rewrite and
either contribute back to JTOPEN, or write our own library that is more
flexible. I would prefer to implement the bare minimum here that will let us
handle the github DFDL schemas for ISO8583 and IBM4690_TLOG.

For zoned, the dfdl:textZonedSignStyle of 'asciiStandard' is the only one
needed for ISO8583 or IBM4690_TLOG formats, as these use iso-8859-1 and
us-ascii encodings, both of which are ASCII-based.
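
For reference, a minimal Python sketch of reading a zoned number with a
trailing overpunched sign in the 'asciiStandard' style. The function name is
hypothetical, and the mapping is an assumption taken from my reading of the
DFDL spec: a final character '0'..'9' (zone nibble 3) means positive, and
'p'..'y' (zone nibble 7) means negative.

```python
ASCII_DIGITS = "0123456789"


def parse_zoned_ascii_standard(text: str) -> int:
    """Decode a zoned decimal string whose last character carries the sign."""
    if not text:
        raise ValueError("empty zoned number")

    body, last = text[:-1], text[-1]
    if any(c not in ASCII_DIGITS for c in body):
        raise ValueError("non-digit before the final character")

    if "0" <= last <= "9":
        negative, digit = False, ord(last) - 0x30  # zone 3: positive
    elif "p" <= last <= "y":
        negative, digit = True, ord(last) - 0x70   # zone 7: negative
    else:
        raise ValueError("invalid overpunched character: %r" % last)

    value = int(body + str(digit))
    return -value if negative else value
```

With that mapping "123" is 123 and "12s" is -123, since 's' is 0x73.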

No TDML or unit tests exercise EBCDIC zoned data currently, so we can initially
focus on ascii only. We do claim to support EBCDIC encoding and do support it
for textNumberRep='standard', so we do need to support it for
textNumberRep='zoned' eventually. Decoding the EBCDIC data will result in a
string that needs to be interpreted according to textZonedSignStyle of
'asciiTranslatedEBCDIC'.
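
A minimal Python sketch of that later step, for a trailing overpunched sign
only. The names are hypothetical, and the character table is an assumption
based on the usual translation of EBCDIC overpunch bytes: '{' and 'A'..'I'
are positive 0..9, '}' and 'J'..'R' are negative 0..9, and a plain digit is
treated as positive.

```python
# Final character -> (digit, is_negative). Assumed mapping, see above.
OVERPUNCH = {}
for d, c in enumerate("0123456789"):
    OVERPUNCH[c] = (d, False)
for d, c in enumerate("{ABCDEFGHI"):
    OVERPUNCH[c] = (d, False)
for d, c in enumerate("}JKLMNOPQR"):
    OVERPUNCH[c] = (d, True)


def parse_zoned_translated_ebcdic(text: str) -> int:
    """Decode a zoned string whose last character is an EBCDIC overpunch."""
    if not text:
        raise ValueError("empty zoned number")

    body, last = text[:-1], text[-1]
    if any(c not in "0123456789" for c in body):
        raise ValueError("non-digit before the final character")
    if last not in OVERPUNCH:
        raise ValueError("invalid overpunched character: %r" % last)

    digit, negative = OVERPUNCH[last]
    value = int(body + str(digit))
    return -value if negative else value
```

Under that table "12C" is 123 and "12L" is -123.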
