Hi, Steve,
  Did you do an fsck on the hard drive (or whatever the
SUN equivalent is these days)? I had a similar problem
recently where a system lost power. It started up ok
and recovered using the ext3 journal, but my data was missing
after the salvage. After a few salvage attempts, my data
was still missing. I stopped the AFS fileserver, unmounted
the partitions, used fsck to check them all manually, and
sure enough the partition that was flaking had errors.
Once I cleaned that up, I salvaged again, and voila!...
my data reappeared.
  I recommend checking the underlying filesystem for
errors. It may be too late if you've already started
restoring data to the partition... but perhaps for
future reference.

OS: RedHat 9
Kernel: 2.4.30
AFS: 1.2.13

Mike Polek
Pictage, Inc.


 > ---- Original Message ----
 > From: rader

More information, fwiw...

 - SalvageLog.old indicates (the initial) salvaging started
   at 01:07:43

 - BosLog indicates that the initial salvage exited with signal 15
   (SIGTERM) at 05:00:38

 - SalvageLog indicates another salvage--the one that went
   awry--started at 05:00:38 and completed 06:44:41

 - bos getrestart reports the server should restart for
   new binaries at "5:00 am"

Is it possible that the "restart for new binaries" erroneously fired
and sent SIGTERM (signal 15) to the running bos salvage, leaving the
volume in an inconsistent state that caused the subsequent salvage to
blow chunks?  (I'm under the general impression that interrupting
salvages is a bad idea.)
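
For anyone wanting to check the same coincidence on their own server, here
is a minimal sketch of the timeline reasoning above. The helper names
(parse_hms, near_scheduled_restart) and the 120-second tolerance are my own
assumptions, not anything from bos or the logs; the timestamps are the ones
quoted above, with "5:00 am" written as 05:00:00. It does not handle times
that wrap past midnight.

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S"

def parse_hms(text: str) -> datetime:
    """Parse an HH:MM:SS string; the date part is a dummy and is ignored."""
    return datetime.strptime(text, FMT)

def near_scheduled_restart(event: str, scheduled: str,
                           tolerance_s: int = 120) -> bool:
    """True if `event` is within `tolerance_s` seconds of `scheduled`.

    The tolerance is an arbitrary illustrative value.
    """
    delta = parse_hms(event) - parse_hms(scheduled)
    return abs(delta.total_seconds()) <= tolerance_s

def elapsed(start: str, end: str) -> timedelta:
    """Time between two same-day HH:MM:SS stamps."""
    return parse_hms(end) - parse_hms(start)

if __name__ == "__main__":
    salvage_start = "01:07:43"   # from SalvageLog.old
    salvage_killed = "05:00:38"  # from BosLog (signal 15)
    restart_time = "05:00:00"    # from bos getrestart ("5:00 am")

    print("killed near scheduled restart:",
          near_scheduled_restart(salvage_killed, restart_time))
    print("first salvage ran for:", elapsed(salvage_start, salvage_killed))
```

With the times above, the kill lands 38 seconds after the scheduled restart,
and the first salvage had been running for 3:52:55 when it was interrupted.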

At any rate, I've turned off the "restarts for new binaries at
5:00 am" thing.

steve
- - -
systems & network manager
high energy physics
university of wisconsin

 > ---- Original Message ----
 > From: rader
 >
 > One of our servers (Solaris7 inode fileserver running 1.2.11) lost
 > power this morning and the resulting bos salvage on a large (50 GB)
 > volume removed about 600,000 files....  /usr/afs/logs/SalvageLog
 > reads, for example...
 >
 >  07/29/2005 06:19:26 dir vnode 87953: invalid entry: \
 >    ./cmsprod/cern/setup.sh (vnode 2258102, unique 14499243)
 >  07/29/2005 06:19:26 dir vnode 87953: ./cmsprod/cern/setup.sh \
 >    (vnode 2258102): unique changed from 14499243 to 0 -- deleted
 >
 > Does anybody have any suggestions about how to recover the lost
 > files??  (I'm restoring from tape now, but I'll still have the
 > busted volume around when I'm done.)
 >
 > steve
 > - - -
 > systems & network manager
 > high energy physics
 > university of wisconsin
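
When deciding what to restore from tape, it may help to pull the list of
removed paths out of the SalvageLog. Below is a rough sketch that matches
the "unique changed from ... -- deleted" lines in the format quoted above.
It assumes each log entry sits on a single line (the backslashes in the
quoted excerpt look like wrapping for email), and the function name
deleted_entries is purely illustrative. Other salvager versions may word
these messages differently, so treat the pattern as a starting point.

```python
import re
from typing import Iterable, List, Tuple

# Matches e.g.:
#   07/29/2005 06:19:26 dir vnode 87953: ./cmsprod/cern/setup.sh
#   (vnode 2258102): unique changed from 14499243 to 0 -- deleted
# (shown wrapped here; assumed to be one line in the real log)
DELETED_RE = re.compile(
    r"dir vnode (\d+): (.+?) \(vnode (\d+)\): "
    r"unique changed from \d+ to \d+ -- deleted"
)

def deleted_entries(lines: Iterable[str]) -> List[Tuple[int, str, int]]:
    """Return (parent dir vnode, path, file vnode) for each deleted entry."""
    found = []
    for line in lines:
        match = DELETED_RE.search(line)
        if match:
            parent, path, vnode = match.groups()
            found.append((int(parent), path, int(vnode)))
    return found

if __name__ == "__main__":
    sample = [
        "07/29/2005 06:19:26 dir vnode 87953: invalid entry: "
        "./cmsprod/cern/setup.sh (vnode 2258102, unique 14499243)",
        "07/29/2005 06:19:26 dir vnode 87953: ./cmsprod/cern/setup.sh "
        "(vnode 2258102): unique changed from 14499243 to 0 -- deleted",
    ]
    for parent, path, vnode in deleted_entries(sample):
        print(parent, path, vnode)
```

To run it against a real log, pass an open file object, for example
deleted_entries(open("/usr/afs/logs/SalvageLog")).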
_______________________________________________
OpenAFS-info mailing list
https://lists.openafs.org/mailman/listinfo/openafs-info