We finally got our problem resolved. Tivoli support recommended that we upgrade the client from 4.1.2.0 to 4.1.3.0 or 4.2.1.15 to get the fix for APAR IC29368. Tivoli support expected this to get the backup to run. Earlier ADSM-L postings indicated that the fix would cause the client to produce an accurate error message when it failed. I was not particularly surprised when it turned out that ADSM-L was right and Tivoli support was wrong.
The error message from the 4.1.3.0 client confirmed that we were short of memory. Tivoli support quoted an estimate from the developers of 300 bytes per file. On this basis we increased the maximum data segment size to a gigabyte, and with this amount of memory we were able to get a successful backup.
After some of the earlier failures we had run backups with the incrbydate option, thinking this would give us acceptable backup coverage until we got the problem resolved. We were wrong. When the 4.1.3.0 client ran out of memory it still generated a message reporting a successful incremental backup of the large file system. The subsequent backup with the incrbydate option behaved as if there really had been a successful backup, skipping all files with change dates prior to the end of the failed backup. The 4.1.2.0 client also displayed the message reporting a successful incremental backup, in conjunction with the incorrect 'User abort' message. I strongly suspect that this led to the same problem with the subsequent use of the incrbydate option, but I have not been able to prove it.
The memory shortage would have been an inconvenience but not a data integrity problem if the incrbydate option had worked. As it was, we ended up with a major data integrity exposure because of Tivoli's sloppy quality control.
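For anyone who wants to check their own file systems against these numbers, here is a rough sketch of the arithmetic and of the date-selection behaviour described above. It is only an illustration under stated assumptions: the 300 bytes per file figure is the estimate Tivoli support quoted, "a gigabyte" is taken to mean 2**30 bytes, and the function names and the date filter are hypothetical simplifications, not the client's actual code.

```python
from datetime import datetime

# Estimate quoted by Tivoli support; an assumption, not a measured value.
BYTES_PER_FILE = 300


def estimated_client_memory(file_count, bytes_per_file=BYTES_PER_FILE):
    """Rough memory (in bytes) needed to track file_count files."""
    return file_count * bytes_per_file


def max_files_for_limit(limit_bytes, bytes_per_file=BYTES_PER_FILE):
    """Largest file count that fits under a given data segment limit."""
    return limit_bytes // bytes_per_file


def incrbydate_candidates(files, last_backup_date):
    """Simplified model of date-based selection (hypothetical helper).

    files maps a file name to its change date. Only files changed after
    the recorded date of the last incremental are selected; everything
    older is skipped, whether or not it was ever actually backed up.
    """
    return sorted(name for name, changed in files.items()
                  if changed > last_backup_date)


if __name__ == "__main__":
    one_gigabyte = 2 ** 30  # assumed meaning of "a gigabyte"
    print("files that fit in 1 GB:", max_files_for_limit(one_gigabyte))

    # A failed backup wrongly recorded as successful on 2 January:
    files = {
        "old_unsaved_file": datetime(2001, 1, 1),
        "new_file": datetime(2001, 1, 3),
    }
    recorded = datetime(2001, 1, 2)
    print("selected by date:", incrbydate_candidates(files, recorded))
```

In this model the older file is never selected once the failed backup has been recorded as successful, which is the exposure we ran into.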
