2008/10/17 Tim Hare <[EMAIL PROTECTED]>:
> We need to TERSE a fairly large (for us) amount of data. This data is in
> multiple separate datasets now, but needs to be sent as one large sequential
> dataset. We can TERSE the concatenated sequential input of course; but out
> of curiosity I'm wondering: can you TERSE the individual components,
> concatenate the results via IEBGENER, and then UNTERSE the resulting file on
> the other end?

It's trivial to try, but I very much doubt it...

> From what I remember about Lempel-Ziv, the "dictionary" is built as you go
> along but it might mean that the second and subsequent files concatenated
> would be read with incomplete information, resulting in erroneous
> decompression results?

Terse appears to be Lempel-Ziv-Wegman (or Welch, depending on whose
expired patent you prefer W to stand for), but it is a particular
implementation of a general algorithm, and there are header and trailer
records, both undocumented. By inspection, the header is a pretty
straightforward 12-byte piece that describes both some original dataset
characteristics and some encoding method info, but the trailer is longer
and less obvious. It looks to me as though the trailer is just
informational, but I don't know if it contains enough information to be
skipped over reliably.

Regardless, the dictionary after the first compress/decompress operation
would not be the same as the initial dictionary, and you would have no
way to tell the decompressor to start with a virgin dictionary.

Without knowing much about the encoding, you could terse and concatenate
the segments, and then at the other end run a splitter program to scan
through the compressed data looking for headers, and invoke the deterse
for each segment. Unfortunately the headers are not uniquely
identifiable, i.e. there is no eyecatcher, and a syntactically correct
header could occur within the compressed data stream. So your splitter
program would have to scan forward from the 13th byte, treating the data
stream as 12-bit chunks, until it reaches a zero chunk, indicating
logical EOF, then figure out how to skip over the trailer, which doesn't
appear to contain its own length, and scan for the next header. It's
always possible AMATERSE already does this.

Another approach might be to put the original multiple datasets into
members of a PDS, and terse that with AMATERSE, which understands PDS[E]s.
After the deterse, you would have an identical PDS, which could be
easily turned back into a sequential dataset.

Or run a DSS dump selecting your datasets, terse the output of that,
then deterse and DSS restore at the other end.

Tony H.
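
For anyone who wants to experiment with the splitter idea described above, here is a minimal sketch of just the scanning step: skip the 12-byte header, walk the rest as 12-bit chunks, and stop at the first zero chunk. This is not AMATERSE's actual logic, since the format is undocumented. The names find_zero_chunk, HEADER_LEN and CODE_BITS are made up for illustration, and the bit packing (chunks back to back, most significant bit first) is an assumption. It does not attempt to skip the trailer or to recognise the next header.

```python
HEADER_LEN = 12   # header length as observed by inspection (see the post)
CODE_BITS = 12    # chunk width described in the post


def find_zero_chunk(data: bytes, start: int = HEADER_LEN):
    """Return (chunk_index, next_byte_offset) for the first zero chunk, or None.

    ASSUMPTION: 12-bit chunks are packed back to back, most significant
    bit first, beginning immediately after the header.
    """
    total_bits = (len(data) - start) * 8
    for index in range(total_bits // CODE_BITS):
        bit_pos = index * CODE_BITS
        byte_pos = start + bit_pos // 8
        if bit_pos % 8 == 0:
            # Chunk starts on a byte boundary:
            # all 8 bits of this byte plus the high 4 bits of the next.
            chunk = (data[byte_pos] << 4) | (data[byte_pos + 1] >> 4)
        else:
            # Chunk starts mid-byte:
            # the low 4 bits of this byte plus all 8 bits of the next.
            chunk = ((data[byte_pos] & 0x0F) << 8) | data[byte_pos + 1]
        if chunk == 0:
            # Offset of the first whole byte after the zero chunk.
            end_bit = bit_pos + CODE_BITS
            return index, start + (end_bit + 7) // 8
    return None


if __name__ == "__main__":
    # Fake 12-byte header, then the chunks 0x123, 0x456, 0x000, 0xFFF.
    sample = bytes([0x01] * 12) + bytes([0x12, 0x34, 0x56, 0x00, 0x0F, 0xFF])
    print(find_zero_chunk(sample))
```

The returned byte offset is only where the trailer might begin under the assumptions above. Working out the trailer length, and telling a real header from header-like bytes inside compressed data, remain the hard parts.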

