I have a PMR open on this, but I wanted to see if any of you have seen something like this...

Setup:
  TSM 7.1.1.3 / TDP for Oracle 7.1.0.0 clients, Oracle 11.2.0.3
  TSM 6.3.5.0 server
  All systems are AIX 6.1 TL9 SP3

Our DBAs were running a cross-node restore of the previous night's backup of a database on one Oracle server (prod) to the other Oracle server (test). During the restore, a couple of the objects being restored failed with errors like this:

> channel t2: ORA-19870: error while restoring backup piece n4q0s5ko_1_1
> ORA-19501: read error on file "n4q0s5ko_1_1", block number 2176513 (block
> size=512)
> ORA-27190: skgfrd: sbtread2 returned error
> ORA-19511: Error received from media manager layer, error text:
> ANS1271E (RC176) The compressed file is corrupted and cannot be expanded
> correctly.

The tdpoerror.log file additionally contained thousands of the following messages (the numbers in each message varied) for each failed object:

> ANS0361I DIAG: The 6131499099th code was found to be out of sequence.
> The code (307) was greater than (258), the next available slot in the string
> table.

The backups are indeed compressed. But they are not corrupt in TSM; a separate restore later successfully restored the objects that failed on the first try. Nothing in errpt suggests storage or network errors.

We've had a handful of these so far. The only changes of note in the environment that I can think of lately are the TSM API / TDPO client updates to 7.1 and the latest round of Oracle updates. Since the compression is happening at the TSM API level, I think I can rule out the Oracle CPU.

The current word back from support is that a backup was occurring for the source client at the same time as the restore to the target client, which "is completely against recommendation."

I have to say my initial reaction to that statement was "you've got to be kidding me." I don't recall ever seeing such a recommendation. And these are Oracle databases; the DBAs are running log backups for them throughout the day, every day...

Has anyone else seen restore failures like this?
Am I wrong to expect TDPO cross-node restores to work reliably while the source client is backing up more data?

Thanks for any feedback or insight.

=Dave

--
Hello World.                                David Bronder - Systems Architect
Segmentation Fault                                      ITS-EI, Univ. of Iowa
Core dumped, disk trashed, quota filled, soda warm.   [email protected]
