Dear AFS Developers,

Attached is a patch which I believe fixes two serious problems in the
AFS 1.4.x kernel module for Solaris 10 (AFS_SUN510_ENV).


1. SSYS process exiting considered harmful

 The first problem is that setting process flag SSYS on a process that
 exits, as the afs_osi_Invisible routine on Solaris 10 does, causes the
 system not to clean up the contract state of the process.  This leaves
 a dangling kernel-memory pointer in the contract table which used to
 point to the process struct.

 Any user can corrupt kernel memory and cause a panic with the 'ctstat'
 command, and the system cannot shut down without either panicking or
 going into an infinite loop as svc.startd repeatedly tries to kill the
 non-existent process.

2. Taskq MSS job left running after shutdown

 The new 5.10u4 support, which uses taskqs to schedule the MSS probing,
 doesn't clean up when AFS is shut down.  If the module is then
 unloaded, the taskq is left scheduled to run, pointing at a function
 that no longer exists.  Instant panic.
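
 A minimal userland sketch of the shutdown ordering for case #2, assuming
 a hypothetical single-slot queue in place of the real taskq.  The names
 schedule, run_one and shutdown_queue are illustrative only; they are not
 OpenAFS or Solaris DDI functions, and shutting_down merely stands in for
 afs_shuttingdown.  The point it tries to show is that a self-rescheduling
 job has to stop rescheduling before the queue can be drained and torn
 down.

```c
#include <stddef.h>

/* Hypothetical single-slot stand-in for a Solaris taskq. */
typedef void (*job_fn)(void);

static job_fn pending_job = NULL;  /* job scheduled for the next run, if any */
static int shutting_down = 0;      /* plays the role of afs_shuttingdown */
static int poll_runs = 0;          /* how many times the poller did real work */

/* Queue a job for the next run (illustrative, not a DDI call). */
static void schedule(job_fn fn)
{
    pending_job = fn;
}

/* Self-rescheduling poller, analogous to the periodic probe job. */
static void poller(void)
{
    if (shutting_down)
        return;            /* do not reschedule, so the queue can drain */
    poll_runs++;
    schedule(poller);      /* arrange the next run */
}

/* Run at most one pending job; return 1 if a job ran, 0 if none was queued. */
static int run_one(void)
{
    job_fn fn = pending_job;

    if (fn == NULL)
        return 0;
    pending_job = NULL;
    fn();
    return 1;
}

/*
 * Shutdown: raise the flag first, then drain whatever is still queued.
 * Returns 1 if a job is still pending afterwards, 0 if it is safe to
 * tear the queue down.
 */
static int shutdown_queue(void)
{
    shutting_down = 1;
    while (run_one())
        ;
    return pending_job != NULL;
}
```

 Without the early return in poller(), the drain loop in this sketch
 would never finish, because every run would queue another one.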


I really don't know why the code would set SSYS on a userland process
that's about to exit in the first place.  Can anyone shed any light?

I'm not sure of the placing of the cleanup code for case #2, as no
spot seems particularly better than another in afs_shutdown().

This is a diff against 1.4.5 but I think it should apply cleanly to
1.4.6.  I've no idea if these bugs are also present in the 1.5.x branch.

Since it is fairly small I've included it here.  I apologise if that's
against list etiquette.

Regards,

 - Mike
diff -uNr openafs-1.4.5.orig/src/afs/afs_call.c openafs-1.4.5/src/afs/afs_call.c
--- openafs-1.4.5.orig/src/afs/afs_call.c       2007-10-17 13:51:44.000000000 +1000
+++ openafs-1.4.5/src/afs/afs_call.c    2008-01-30 18:02:20.008681000 +1100
@@ -1862,6 +1862,11 @@
 #else
     afs_termState = AFSOP_STOP_COMPLETE;
 #endif
+#ifdef AFS_SUN510_ENV
+    afs_warn("NetIfPoller... ");
+    ddi_taskq_destroy(afs_taskq);
+    rw_destroy(&afsifinfo_lock);
+#endif
     afs_warn("\n");
 
     /* Close file only after daemons which can write to it are stopped. */
diff -uNr openafs-1.4.5.orig/src/afs/afs_osi.c openafs-1.4.5/src/afs/afs_osi.c
--- openafs-1.4.5.orig/src/afs/afs_osi.c        2007-04-04 04:57:06.000000000 +1000
+++ openafs-1.4.5/src/afs/afs_osi.c     2008-01-29 18:43:32.090145000 +1100
@@ -291,7 +291,7 @@
 {
 #ifdef AFS_LINUX22_ENV
     afs_osi_MaskSignals();
-#elif defined(AFS_SUN5_ENV)
+#elif defined(AFS_SUN5_ENV) && !defined(AFS_SUN510_ENV)
     curproc->p_flag |= SSYS;
 #elif defined(AFS_HPUX101_ENV) && !defined(AFS_HPUX1123_ENV)
     set_system_proc(u.u_procp);
diff -uNr openafs-1.4.5.orig/src/rx/SOLARIS/rx_knet.c openafs-1.4.5/src/rx/SOLARIS/rx_knet.c
--- openafs-1.4.5.orig/src/rx/SOLARIS/rx_knet.c 2007-10-05 12:54:10.000000000 +1000
+++ openafs-1.4.5/src/rx/SOLARIS/rx_knet.c      2008-01-30 16:36:20.033430000 +1100
@@ -591,6 +591,12 @@
     int index;
     uint_t mtu;
     uint64_t flags;
+    extern int afs_shuttingdown;
+
+    /* If we're shutting down we need to stop rescheduling more
+     * taskq runs so we can destroy the taskq */
+    if (afs_shuttingdown)
+       return;
 
     /* Get our permissions */
     cr = CRED();
