This sounds like a 1.4.8 vs 1.4.10 issue and may not be
Solaris related.
David R Boldt wrote:
We use Solaris 10 SPARC exclusively for our AFS servers.
After upgrading to 1.4.10 from 1.4.8 we had a very few
volumes that started spontaneously going off-line, recovering,
and then going off-line again until they needed to be salvaged.
I am assuming you compile the inode versions yourself as the OpenAFS
1.4.8 and 1.4.10 releases for Solaris 10 were all compiled with namei.
Hearing that this might be related to inode, we moved these
volumes to a set of little-used fileservers that were running
namei at 1.4.10. It made no discernible difference.
So this may not be a namei vs inode issue.
Two volumes in particular accounted for >90% of our off-line
volume issues.
FileLog:
Mon Apr 27 10:56:09 2009 Volume 2023867468 now offline, must be salvaged.
Mon Apr 27 10:56:15 2009 Volume 2023867468 now offline, must be salvaged.
Mon Apr 27 10:56:15 2009 Volume 2023867468 now offline, must be salvaged.
Mon Apr 27 10:56:22 2009 fssync: volume 2023867469 restored; breaking all call backs
(restored vol above being R/O for R/W in need of salvage)
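To check which volumes account for most of the off-line messages, a FileLog excerpt like the one above can be tallied per volume ID. The following is only a minimal sketch: the function name count_offline_events and the sample list are made up for illustration, and the regular expression assumes the message wording is exactly as in the lines quoted above.

```python
import re
from collections import Counter

# Assumes the message format seen in the excerpt above, e.g.:
#   Mon Apr 27 10:56:09 2009 Volume 2023867468 now offline, must be salvaged.
OFFLINE_RE = re.compile(r"Volume (\d+) now offline, must be salvaged")


def count_offline_events(lines):
    """Return a Counter mapping volume ID (as a string) to offline-message count."""
    counts = Counter()
    for line in lines:
        match = OFFLINE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample input copied from the FileLog excerpt in this message.
sample = [
    "Mon Apr 27 10:56:09 2009 Volume 2023867468 now offline, must be salvaged.",
    "Mon Apr 27 10:56:15 2009 Volume 2023867468 now offline, must be salvaged.",
    "Mon Apr 27 10:56:15 2009 Volume 2023867468 now offline, must be salvaged.",
    "Mon Apr 27 10:56:22 2009 fssync: volume 2023867469 restored; breaking all call backs",
]

if __name__ == "__main__":
    # List volumes from most to least frequently reported off-line.
    for volume_id, hits in count_offline_events(sample).most_common():
        print(volume_id, hits)
```

To run it against a real log, pass an open file object instead of the sample list; the location of FileLog depends on how the server was installed.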
Both of the volumes most frequently impacted have content
completely rewritten roughly every 20 minutes while being on
an automated replication schedule of 15 minutes. One of them
is 25MB, the other 95MB, both at about 80% of quota.
How long does the replication take?
We downgraded just the fileserver binary to 1.4.8 on all of
our servers and have not seen a single off-line message in
36 hours.
-- David Boldt
<[email protected]>
--
Douglas E. Engert <[email protected]>
Argonne National Laboratory
9700 South Cass Avenue
Argonne, Illinois 60439
(630) 252-5444
_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info