Solaris 2.6 / TSM Server 4.1.2.12.

A few days ago we experienced a spontaneous crash/reboot during a TSM BACKUP DB operation. Following the reboot, our TSM Server software came up normally and we repeated the BACKUP DB operation, which reported successful completion. Shortly afterward, we did a DRM Prepare, MOVE MEDIA and MOVE DRMEDIA, and then initiated our 2nd daily BACKUP DB. The 2nd BACKUP DB failed immediately with error:

"ANR9999D ic.c(329): Zero bit count mismatch for SMP page addr 738304; Zero Bits =105, HeaderZeroBits = 0."

Aside from the BACKUP DB failure, the TSM Server is fully operational, performing scheduled tasks, client backups, migrations, etc.

At this point we're in a Catch-22. Several days have passed. We can't get the DB to back itself up (full or incremental), and we fear that once we halt the TSM Server software, it won't restart. We would then be forced to restore from the last good DB backup, which would now cost us several nights' "successful" backup cycles. Is there a way to fix or recover from this without losing all those client backups?

This is critical & getting more so daily. I need some fast answers & a plan of attack from TSM'ers who have been through this:

- has anyone successfully recovered from an SMP page mismatch error without a DB Restore?
- given that the DB is up/functional now, and has performed several nights' backups, is there a way to export/preserve client backup activity since the incident occurred?
- if we bring down TSM, should we disable all or specific DB and/or Log mirrors first?
- how can I determine which DB volume contains the problem SMP page number?
- has anyone (incl. Tivoli support/consulting) ever successfully repaired an SMP page mismatch error, & if so how, using what tools/utilities?
- will AUDIT DB detect/fix an SMP page header mismatch error?
- would UNLOAD DB / LOAD DB be better than AUDIT DB, or do we need to run both, and if so, in what order?

-rsvp, thanks

Kent Monthei
GlaxoSmithKline
