On Wed, 11 Apr 2007, Derrick J Brashear wrote:
On Wed, 11 Apr 2007, Stephan Wiesand wrote:
One of our systems panicked twice within two hours yesterday, at the same
location in the OpenAFS client. I attached the kernel's last words below.
This is an SL3 system, kernel 2.4.21-47.0.1.ELsmp, i686. The client build
has two patches on top of 1.4.4: linux-task-pointer-safety-20070320 from
CVS, and the one from
https://lists.openafs.org/pipermail/openafs-devel/2007-March/014985.html
afs_HashOutDCache has:

    /* if this guy is in the hash table, pull him out */
    if (adc->f.fid.Fid.Volume != 0) {
        i = DCHash(&adc->f.fid, adc->f.chunk);
        us = afs_dchashTbl[i];
        if (us == adc->index) {
            ..
        } else {
            /* somewhere on the chain */
            while (us != NULLIDX) {
                if (afs_dcnextTbl[us] == adc->index) {
                    /* found item pointing at the one to delete */
                    afs_dcnextTbl[us] = afs_dcnextTbl[adc->index];
                    break;
                }
                us = afs_dcnextTbl[us];
            }
            if (us == NULLIDX)
                osi_Panic("dcache hc");

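So basically you appear to have an unhashed dcache entry: the walk reached
the end of the chain (NULLIDX) without finding anything that points at
adc->index, and that is what triggers the "dcache hc" panic. Either there's
a locking bug or something is becoming erroneously unhashed.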
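
For illustration only, here is a minimal, self-contained sketch of the same
index-linked hash chain idea. It is not the OpenAFS code: hash_tbl, next_tbl,
chain_insert, chain_remove, HASH_SIZE and NENTRIES are hypothetical names and
sizes, and the head-of-chain branch (elided as ".." in the excerpt above) is
an assumption. Where the real code panics, the sketch returns -1.

```c
#include <assert.h>

#define NULLIDX   (-1)  /* end-of-chain marker, named as in the excerpt */
#define HASH_SIZE 4     /* hypothetical number of hash buckets */
#define NENTRIES  8     /* hypothetical number of cache slots */

static int hash_tbl[HASH_SIZE]; /* head slot per bucket (stand-in for afs_dchashTbl) */
static int next_tbl[NENTRIES];  /* next slot on the chain (stand-in for afs_dcnextTbl) */

/* Mark every bucket and every slot as "not on any chain". */
static void chain_init(void)
{
    for (int i = 0; i < HASH_SIZE; i++)
        hash_tbl[i] = NULLIDX;
    for (int i = 0; i < NENTRIES; i++)
        next_tbl[i] = NULLIDX;
}

/* Push slot idx onto the front of the chain for bucket. */
static void chain_insert(int bucket, int idx)
{
    assert(bucket >= 0 && bucket < HASH_SIZE);
    assert(idx >= 0 && idx < NENTRIES);
    next_tbl[idx] = hash_tbl[bucket];
    hash_tbl[bucket] = idx;
}

/*
 * Unlink slot idx from the chain for bucket.
 * Returns 0 on success, or -1 if idx is not on that chain, which is
 * the condition under which the excerpt calls osi_Panic("dcache hc").
 */
static int chain_remove(int bucket, int idx)
{
    int us = hash_tbl[bucket];

    if (us == idx) {
        /* Assumed head case: the bucket now points at our successor. */
        hash_tbl[bucket] = next_tbl[idx];
        next_tbl[idx] = NULLIDX;
        return 0;
    }
    /* Somewhere on the chain: look for the slot pointing at idx. */
    while (us != NULLIDX) {
        if (next_tbl[us] == idx) {
            next_tbl[us] = next_tbl[idx];
            next_tbl[idx] = NULLIDX;
            return 0;
        }
        us = next_tbl[us];
    }
    return -1; /* walked off the end: idx was already unhashed */
}
```

In this sketch, removing the same slot twice, or removing it from a bucket it
was never hashed into, both end in the -1 path. That mirrors the two
possibilities above: an entry unhashed by someone else (a race), or an entry
whose fid/chunk no longer hashes to the bucket it actually sits in.
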
How reproducible is it?
Not easily. I tried to apply some cache pressure by reading several large
files at the same time, but no luck yet. I'll try to get my suspect to
admit what he actually did.
--
Stephan Wiesand
DESY - DV -
Platanenallee 6
15738 Zeuthen, Germany