Title: Typical success rates?
Never have I had a day without failures. Here's a sample from my past
24 hours (v5.1MP5 backup server ... various clients numbering less than
150 ... reported errors only) ...
- machine - status - explanation
- 01 - 41 - This is one of several laptops that are backed up
whenever they are connected, but they aren't connected very often. I wish
NetBackup could poll these machines quietly and back them up when they
appear. (About 2 dozen job failures of this type have been omitted from
this report.)
- 02 - 1 - A mailbox could not be enumerated. The Exchange person
may correct these someday.
- 03 - 54 - bpbrm listen for client timeout during accept from data
listen socket for 60 seconds (will look into this one, especially if it
repeats)
- 04 - 58 - cannot connect (application does not play well with
NetBackup client - only a few backups are successful)
- 05 - 1 - cannot open file - in use by another process (will try
to exclude these files because the error appears permanent).
- 06 - 6 - failed to backup requested files. This was a CINC
Oracle backup on an idle DB. Maybe I can adjust the script to force a
change in the DB, or to skip the backup when nothing has changed.
- 07 - 6 - same as machine 06.
- 08 - 54 - timeout connecting to client. NetBackup server was
delayed obtaining a tape drive, causing Oracle/RMAN to give up (I
think).
- 09 - 41 - network connection timed out. This was at very end of
backup ("end writing" in job details). Happens occasionally with this
client.
- 10 - 1 - Some ".tmp" files in use by another process. Will add
"*.tmp" to exclude list, but probably at expense of slowing backups?
Also, unable to export RSM database.
- 11 - 58 - cannot connect to client. Client machine is spread all
over a table, with HP trying to find what's wrong with it. Has been
down for several *weeks*. Have manually extended expiration of
existing backups. Too bad you can't tell NetBackup to keep its last
full backups of a client & policy.
- 12 - 41 - similar to machines 06 and 09.
- 13 - 1 - Several "filemaker" files unavailable for backup. We
don't exclude because sometimes they can be backed up and that's better
than none.
- 14 - 41 - Another mysterious network connection timeout at or
near the end of a file system backup, on a job whose start was delayed
by "busy resources".
- 15 - 57 - client connection refused. Similar to machine 04.
- 16 - 1 - A Windows file, access_log, has a portion locked by
another process. I cannot fix this without putting client in a policy
of its own, because the file is included for processing by a necessary
include list entry.
- 17 - 54 - Machine is powered off due to a power outage. User
doesn't care because he no longer works there and management hasn't
decided what to do yet.
- 18 - 1 - A few classical "in use" failures (Windows Defender and
perfdata), as well as a "cannot open old TIR file" failure. Backing up
TIR files seems bogus, but I otherwise don't know how to avert the
problem.
- 19 - 58 - powered off due to same power outage as machine 17.
Machine is going away, but owner might resurrect it or want one more
backup.
- 20 - 58 - trying machine 11 backup again.
- 21 - 25 - cannot execute cmd on client. No idea why this
Exchange DB CINC backup failed (immediately). A later job worked.
- 22 - 1 - A relatively new Linux client trying to back up "sparse
file /sys/bus/pci/..." (many). Will suggest the owner exclude them.
- 23 - 1 - same as machine 22.
So that's about 50 jobs with errors out of about 375, or about 10-15%
of jobs. It gets better if you don't count status code 1s as failures,
worse if you consider a few clients have many file systems and
multi-stream enabled, and much better if you throw out all the failures
that are "expected"!
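
For anyone who wants to redo the arithmetic, here is a rough Python
sketch of the tally. The status codes are copied from the list above;
treating "about 2 dozen" omitted laptop jobs as exactly 24 and "about
375" as exactly 375 are my assumptions, and the names (listed,
failure_rate, and so on) are made up for illustration, not anything
NetBackup provides.

```python
from collections import Counter

# Status codes of the 23 listed jobs, in machine order 01..23.
listed = [41, 1, 54, 58, 1, 6, 6, 54, 41, 1, 58, 41,
          1, 41, 57, 1, 54, 1, 58, 58, 25, 1, 1]

OMITTED_STATUS_41 = 24  # assumption: "about 2 dozen" omitted laptop jobs
TOTAL_JOBS = 375        # assumption: "about 375" jobs in the 24 hours

counts = Counter(listed)
counts[41] += OMITTED_STATUS_41


def failure_rate(counts, total, ignore=()):
    """Fraction of jobs that ended with a status code not in `ignore`."""
    failed = sum(n for code, n in counts.items() if code not in ignore)
    return failed / total


all_errors = failure_rate(counts, TOTAL_JOBS)            # every non-zero status
hard_only = failure_rate(counts, TOTAL_JOBS, ignore={1})  # status 1 not counted

print(f"all reported errors: {all_errors:.1%}")
print(f"ignoring status 1:   {hard_only:.1%}")
```

With those assumptions it comes out to 47 of 375 jobs (about 12.5%)
with any error, or 39 of 375 (about 10.4%) if status 1 (partially
successful) is not counted as a failure.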
cheers, wayne
Whelan, Patrick wrote, in part, on 6/29/2006 1:48 PM:
Do you usually have a 100% success every backup
session? If not what is a typical success rate?
_______________________________________________
Veritas-bu maillist - Veritas-bu@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu