On Thu, Nov 29, 2012 at 9:08 AM, Kevin Hildebrand <ke...@umd.edu> wrote:
>
> Hi, we've got a Solaris 10 box running OpenAFS-1.6.1 that is periodically
> becoming non-responsive and requiring a hard restart.
>
> From what I can see from looking at crash dumps, it appears that the threads
> that are hanging are in the process of doing file access (stat/lookup) in
> AFS.
>
> For example:
>>
>> 2a1043be951::stack

The two candidates for this mutex are AFS_GLOCK and the vnode mutex for the
AFS root vnode. Try the ::findlocks macro to see what's holding it.

>
> mutex_vector_enter+0x428(190d458, 2, 707dcf50, fffb14c0ca47a08a,
> 2a100297c81, 0)
> afs_root+0x3c(6003f54ce40, 2a1043bf548, 1, 0, 6002542e840, 0)
> fsop_root+0x10(6003f54ce40, 2a1043bf548, 6002153acc8, 2420, 0, 7afc1490)
> traverse+0x7c(2a1043bf678, 2a1043bf548, 0, 0, 6002542e840, 6002153acc8)
> lookuppnvp+0x3d0(2a1043bf940, 0, 6002542e840, 2a1043bf678, 2a1043bf680,
> 60021529a40)
> lookuppnat+0x120(60021529a40, 0, 1, 0, 2a1043bfad8, 0)
> lookupnameat+0x5c(0, 0, 1, 0, 2a1043bfad8, 0)
> cstatat_getvp+0x198(ffd19400, 100ae7708, 1, 1, 2a1043bfad8, 0)
> cstatat+0x40(ffffffffffd19553, 100ae7708, 1000, 100405a50, 0, 10)
> syscall_trap+0xac(100ae7708, 100405a50, 100b52fe8, 16, 100b7bffc, 4)
>>
>>
>
> For this particular crash dump, I have hundreds of threads that are stuck in
> this location.
>
> I'd appreciate any suggestions on how to debug this further.
>
> Thanks,
> Kevin
>
> --
> Kevin Hildebrand
> University of Maryland, College Park
> Division of IT
> _______________________________________________
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info
>

--
Derrick