On 7 April 2015 at 12:22, Paul Gilmartin
<[email protected]> wrote:
> IIRC, AMATERSE has a restriction that a PDS can not be PACKed directly
> to a tape, but must first be PACKED to DASD (another (large) workfile),
> then copied to tape.

I don't know IYRC, but it sounds like one of those many arbitrary
restrictions to do with UNIX files on z/OS.

> I suspect this restriction arises from a need to POINT to a prologue block and
> update it in place at the end of the operation.

Perhaps, but I don't think so. I have looked at a number of header
blocks output by various implementations of the terse algorithm, and
they seem to be at most 12 bytes long, containing no information that
is unavailable before the start of compression. In particular, there is
nothing about the compressed size or number of symbols. But that's the
overall header I'm talking about; for tersed PDS[E]s there is a
following member directory of some sort, and I suppose that might need
updating after compressing. But surely if that's the case it can only
be to allow selective decompression of members (by providing a member
offset into the compressed stream), and I don't think that's
supported. And in any case, this member directory is itself
compressed, so I think there is little chance that any header
information is updated based on anything known only after compression.

There is also a trailer block of some sort, but I haven't tried to
analyse it beyond noticing that it contains a time stamp, and that it
can be removed from the end of the compressed data without causing
decompression to complain. I was interested in the header only as part
of identifying its "magic" value, which seems close to impossible. It
is, however, possible to use the header to sanity check a putative
tersed file against claims made about it by the sender. If the lrecl,
blocksize, and recfm match what is expected, there's a reasonable
chance that it hasn't been ASCII corrupted or otherwise damaged in
transmission, and is worth a trial decompression.
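
As a rough sketch of that sanity check: the code below compares the
attributes found in the first 12 bytes of a file against what the
sender claims. The field offsets in HYPOTHETICAL_LAYOUT are invented
for illustration and are not the real terse header layout, which is
not described here; the names ClaimedAttrs, parse_header and
worth_trial_decompression are likewise made up for this example.

```python
import struct
from typing import NamedTuple


class ClaimedAttrs(NamedTuple):
    """Dataset attributes the sender says the original file had."""
    recfm_flags: int
    lrecl: int
    blksize: int


# ASSUMPTION: purely illustrative big-endian layout, 12 bytes in total:
# 1-byte version, 1-byte recfm flags, 2-byte lrecl, 2-byte blksize,
# 6 bytes ignored. Replace with offsets verified against real files.
HYPOTHETICAL_LAYOUT = ">BBHH6x"


def parse_header(data: bytes, layout: str = HYPOTHETICAL_LAYOUT) -> ClaimedAttrs:
    """Extract recfm flags, lrecl and blksize from the start of the file."""
    size = struct.calcsize(layout)
    if len(data) < size:
        raise ValueError("file is shorter than the header")
    _version, recfm_flags, lrecl, blksize = struct.unpack(layout, data[:size])
    return ClaimedAttrs(recfm_flags, lrecl, blksize)


def worth_trial_decompression(data: bytes, expected: ClaimedAttrs) -> bool:
    """True if the header attributes match what the sender claimed."""
    try:
        return parse_header(data) == expected
    except ValueError:
        return False
```

A mismatch does not prove corruption, and a match does not prove the
file is intact; it only helps decide whether a trial decompression is
worth the effort.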

One wonders why AMATERSE is still in use. The terse algorithm (expired
IBM US patent 4814746) has properties that suit it best to dynamic
use in devices like modems, where the data cannot be analyzed in
advance. There are more efficient and widely available compression
algorithms and implementations, including some with support in IBM
hardware and/or millicode.

Tony H.

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
