------------------------------<snip>------------------------------
I'm wondering if your primary rule of not compressing a file unless it
will exceed its architectural limit may have blocked the opportunity for
you to come across cases where compression is not a waste of time.
Synchronous remote copy is one area where compression of datasets
created or updated in the batch critical path can buy back the impact of
write elongation, and sometimes leave you with improved elapsed times.
For example, where TC, SRDF, or PPRC introduces a 100% write response
time impact in a father-to-son update, but the input and output files
are compressed to 40% of their original size, the result is a net run
time improvement of 20% due to I/O reduction.
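The arithmetic behind that 20% figure can be sketched as follows. The 2x
write elongation and the 40% compressed size are the numbers from the
example above; the assumption that write time scales linearly with data
volume is mine:

```python
# Sketch of the run-time arithmetic in the example above.
# Assumes write time scales linearly with the volume of data written.

write_elongation = 2.0   # 100% write response time impact from synchronous copy
compressed_ratio = 0.40  # files compressed to 40% of their original size

# Relative write time versus an uncompressed, non-mirrored baseline:
relative_time = write_elongation * compressed_ratio
print(f"relative write time: {relative_time:.2f}")    # 0.80
print(f"net improvement:     {1 - relative_time:.0%}")  # 20%
```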
Another example is datasets written or updated once and then read by
many programs. Even a reduction in size of only 40% yields a
corresponding reduction in I/O activity, multiplied by the number of
times the file is read. With LZW compression there is only a small
increase in CPU time to decompress the file on each read, usually
around 15-20% of the CPU cost of compressing it. The more you read the
dataset, the more benefit you obtain.
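A rough model of that compress-once, read-many trade-off, using the
15-20% decompress-to-compress ratio quoted above. The absolute
compression cost and the read counts are illustrative placeholders, not
measured figures:

```python
# Rough CPU-cost model for a compress-once, read-many dataset.
# compress_cost is in arbitrary units; decompress_ratio reflects the
# 15-20% asymmetry quoted above (LZW-style), contrasted with ~100%
# for a symmetric-cost algorithm.

def total_cpu(compress_cost, decompress_ratio, reads):
    """One compression plus `reads` decompressions."""
    return compress_cost * (1 + decompress_ratio * reads)

compress_cost = 100.0  # illustrative units of CPU time
for reads in (1, 10, 100):
    asym = total_cpu(compress_cost, 0.175, reads)  # midpoint of 15-20%
    sym = total_cpu(compress_cost, 1.0, reads)     # symmetric cost
    print(f"{reads:4d} reads: asymmetric {asym:8.1f}  symmetric {sym:8.1f}")
```

The gap widens with every read: at 100 reads the asymmetric algorithm
has spent a small multiple of the one-time compression cost, while a
symmetric one has spent roughly a hundred times it.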
These are two examples where I have used DFSMS compression to make
significant improvements to batch run time. The only hiccup was when IBM
moved the compression assist instructions to firmware on the G3 CMOS
processor (it may have been G4). They were back in the hardware by G6
and all the advantages of asymmetric CPU cost for compress/decompress
returned.
Huffman may not be an appropriate compression algorithm for this,
because the implementations I have experience with (e.g. DFSMShsm,
DFSMSdss) have an equal CPU cost for compression and decompression.
That symmetrical CPU cost heavily erodes the value of a compressed
input file.
------------------------------<unsnip>-------------------------------
In the case of ARCHIVER, I do compression because I have to handle
multiple record formats and lengths. Compression was easier than trying
to devise a segmentation scheme that was sufficiently flexible.
Never mind the space issues that can arise.
My current private archive contains nearly 15 million logical records of
source/CLIST/REXX and fits in about 250 cylinders of VSAM cluster.
Rick
----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html