Attached is the cvs diff with the requested flags. I noticed how useless
the previous patch format I used was when I was applying the cancel I/O
patch :)
It does lead to a pvfs2-client-core restart loop if the connection to
the server never comes back. However, the loop is tempered by the BMI
timeout and retry counts, so each cycle should take a reasonably long
time (I don't recall the defaults offhand, but it should only restart
every couple of minutes).
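To illustrate the throttling effect, here is a standalone sketch (not
PVFS code; the supervisor structure, names, and retry/timeout values
below are all made up) of a parent that restarts its child on a
non-zero exit, where each child incarnation burns through its own
timeout/retry budget before giving up:

    /*
     * Hypothetical sketch: a parent restarts its child whenever it
     * exits non-zero, but each child incarnation blocks for up to
     * RETRY_COUNT * TIMEOUT_SECS in its own retry loop before giving
     * up, so restarts only happen every few minutes.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    #define RETRY_COUNT  5   /* stand-in for the BMI retry count   */
    #define TIMEOUT_SECS 30  /* stand-in for the BMI timeout value */

    /* pretend "remount": fails after exhausting its timed retries */
    static int try_remount(void)
    {
        int i;
        for (i = 0; i < RETRY_COUNT; i++)
        {
            sleep(TIMEOUT_SECS);  /* each attempt blocks on a timeout */
            fprintf(stderr, "remount attempt %d failed\n", i + 1);
        }
        return -1;
    }

    int main(void)
    {
        for (;;)
        {
            pid_t pid;
            int status = 0;

            pid = fork();
            if (pid == 0)
            {
                /* child plays the role of pvfs2-client-core */
                exit(try_remount() ? 1 : 0);
            }

            waitpid(pid, &status, 0);
            if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            {
                break;  /* the "remount" finally succeeded */
            }
            /* otherwise loop and restart; the cycle time is bounded
             * below by RETRY_COUNT * TIMEOUT_SECS inside the child */
        }
        return 0;
    }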
Michael
On Mon, Feb 08, 2010 at 01:51:16PM -0500, Phil Carns wrote:
> Hi Michael,
>
> Could you regenerate this patch with "diff -Naupr" (or "cvs diff
> -Naup")? The -u in particular makes it a little easier to read/apply.
>
> I think this is the same issue as described in this open trac entry,
> which would be great to knock out:
>
> https://trac.mcs.anl.gov/projects/pvfs/ticket/66
>
> I haven't traced through the code yet to look myself, but is there any
> chance of the pvfs2-client-core getting stuck in a restart loop?
>
> -Phil
>
> Michael Moore wrote:
> > Attached is a patch against head for the issue. The comments largely
> > describe what's going on. If pvfs2-client-core is restarted due to a
> > segfault while a PVFS filesystem was previously mounted, any requests
> > will cause the process to spin.
> >
> > The patch adds a check at the end of the process_vfs_requests
> > while(s_client_is_processing) loop to see whether remount_complete is
> > set to failed. If so, it exits pvfs2-client-core with a non-zero value
> > so a new client-core will get restarted and mount/add the filesystem
> > once exec_remount completes successfully. If everything looks okay,
> > can you apply it to head?
> >
> > Thanks,
> > Michael
> >
> > On Mon, Feb 01, 2010 at 02:09:46PM -0500, Michael Moore wrote:
> >> On Mon, Feb 01, 2010 at 02:04:22PM -0500, Michael Moore wrote:
> >>> We recently saw some strange behavior in pvfs2-client-core when a
> >>> server goes away (via segfault) and the client is unable to re-mount
> >>> the filesystem. The pvfs2-client-core process takes up 100% of a core
> >>> just spinning on process_vfs_requests -> PVFS_sys_testsome and
> >>> subsequent calls. Full backtrace follows.
> >>>
> >>> In looking at the code in pvfs2-client-core, it seems to assume that
> >>> the re-mount will always succeed (around line 3579). However, I don't
> >>> know that it's the root cause of the issue. I'll continue looking but
> >>> wondered if anyone had ideas on this.
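For reference, the pre-patch logic in exec_remount() (condensed here
from the lines the diff at the end of this message removes) marks the
remount complete even when PINT_dev_remount() fails:

    /* pre-patch exec_remount() behavior: the completion flag is set
     * regardless of whether the remount actually worked */
    if (PINT_dev_remount())
    {
        gossip_err("*** Failed to remount filesystems!\n");
    }

    remount_complete = 1;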
> >>> This appears to be re-creatable by:
> >>> 1) cleanly mounting and using the filesystem for some I/O,
> >>> 2) either killing the servers or adding iptables rules on the client
> >>>    to reject traffic to the server, and
> >>> 3) attempting I/O from the client.
> >> I neglected to mention that the pvfs2-client-core must be killed after
> >> attempting I/O traffic to the 'failed' server, as I only saw this
> >> behavior after the client-core restarts. I'm still digging into the
> >> reason the client-core segfaulted after a failed I/O flow.
> >>
> >> Michael
> >>
> >>> The operation correctly dies with connection refused, but the client
> >>> begins to spin, taking up CPU.
> >>>
> >>> (gdb) bt
> >>> #0 0x00511402 in __kernel_vsyscall ()
> >>> #1 0x001d8023 in poll () from /lib/libc.so.6
> >>> #2 0x0082ebed in PINT_dev_test_unexpected (incount=5,
> >>> outcount=0xbf84e4f8, info_array=0x8bbbc0, max_idle_time=10) at
> >>> src/io/dev/pint-dev.c:398
> >>> #3 0x00848f50 in PINT_thread_mgr_dev_push (max_idle_time=10) at
> >>> src/io/job/thread-mgr.c:332
> >>> #4 0x00844caf in do_one_work_cycle_all (idle_time_ms=10) at
> >>> src/io/job/job.c:5238
> >>> #5 0x008454a1 in job_testcontext (out_id_array_p=0xbf8515f0,
> >>> inout_count_p=0xbf8521f4, returned_user_ptr_array=0xbf851df4,
> >>> out_status_array_p=0xbf84e5f0, timeout_ms=10, context_id=0) at
> >>> src/io/job/job.c:4273
> >>> #6 0x00857dba in PINT_client_state_machine_testsome
> >>> (op_id_array=0xbf8522a8, op_count=0xbf8528c4, user_ptr_array=0xbf8527a8,
> >>> error_code_array=0xbf8526a8, timeout_ms=10) at
> >>> src/client/sysint/client-state-machine.c:756
> >>> #7 0x00857fb9 in PVFS_sys_testsome (op_id_array=0xbf8522a8,
> >>> op_count=0xbf8528c4, user_ptr_array=0xbf8527a8,
> >>> error_code_array=0xbf8526a8, timeout_ms=10) at
> >>> src/client/sysint/client-state-machine.c:971
> >>> #8 0x08050cd6 in process_vfs_requests () at
> >>> src/apps/kernel/linux/pvfs2-client-core.c:3119
> >>> #9 0x08052658 in main (argc=10, argv=0xbf852a74) at
> >>> src/apps/kernel/linux/pvfs2-client-core.c:3579
> >>>
> >>> Let me know if more information is needed; thanks for the input!
> >>>
> >>> Michael
>
Index: src/apps/kernel/linux/pvfs2-client-core.c
===================================================================
RCS file: /projects/cvsroot/pvfs2/src/apps/kernel/linux/pvfs2-client-core.c,v
retrieving revision 1.107
diff -a -u -p -r1.107 pvfs2-client-core.c
--- src/apps/kernel/linux/pvfs2-client-core.c 29 Jan 2010 21:41:34 -0000 1.107
+++ src/apps/kernel/linux/pvfs2-client-core.c 8 Feb 2010 19:06:45 -0000
@@ -130,9 +130,13 @@ typedef struct
be serviced by our regular handlers. to do both, we use a thread
for the blocking ioctl.
*/
+#define REMOUNT_NOTCOMPLETED 0
+#define REMOUNT_COMPLETED 1
+#define REMOUNT_FAILED 2
static pthread_t remount_thread;
static pthread_mutex_t remount_mutex = PTHREAD_MUTEX_INITIALIZER;
-static int remount_complete = 0;
+static int remount_complete = REMOUNT_NOTCOMPLETED;
+
/* used for generating unique dynamic mount point names */
static int dynamic_mount_id = 1;
@@ -502,12 +506,17 @@ static void *exec_remount(void *ptr)
will fill in our dynamic mount information by triggering mount
upcalls for each fs mounted by the kernel at this point
*/
+
+ /* if PINT_dev_remount fails, set remount_complete appropriately */
if (PINT_dev_remount())
{
gossip_err("*** Failed to remount filesystems!\n");
+ remount_complete = REMOUNT_FAILED;
+ }
+ else
+ {
+ remount_complete = REMOUNT_COMPLETED;
}
-
- remount_complete = 1;
pthread_mutex_unlock(&remount_mutex);
return NULL;
@@ -2842,7 +2851,7 @@ static inline PVFS_error handle_unexp_vf
goto repost_op;
}
- if (!remount_complete &&
+ if (remount_complete == REMOUNT_NOTCOMPLETED &&
(vfs_request->in_upcall.type != PVFS2_VFS_OP_FS_MOUNT))
{
gossip_debug(
@@ -3123,6 +3132,7 @@ static PVFS_error process_vfs_requests(v
for(i = 0; i < op_count; i++)
{
vfs_request = vfs_request_array[i];
+
assert(vfs_request);
/* assert(vfs_request->op_id == op_id_array[i]); */
if (vfs_request->num_ops == 1 &&
@@ -3269,6 +3279,29 @@ static PVFS_error process_vfs_requests(v
vfs_request, "normal_completion");
assert(ret == 0);
}
+
+ /* The status of the remount thread needs to be checked in the event
+ * the remount fails on client-core startup. If this is the initial
+ * startup then any mount requests will fail as expected and the
+ * client-core will behave normally. However, if a mount was
+ * previously successful (in a previous client-core incarnation), the
+ * client-core doesn't check whether the remount succeeded before
+ * handling the mount request and fs_add. Any subsequent requests then
+ * cause this thread to spin around PINT_dev_test_unexpected.
+ *
+ * Given the current structure of process_vfs_requests (the remount
+ * thread is created before entering the while loop), exiting the
+ * client-core on a failed remount attempt is the most straightforward
+ * way to handle this case. Exiting causes the parent to kick off
+ * another client-core, which retries the remount until it succeeds.
+ */
+ if( remount_complete == REMOUNT_FAILED )
+ {
+ gossip_debug(GOSSIP_CLIENTCORE_DEBUG,
+ "%s: remount not completed successfully, no longer "
+ "handling requests.\n", __func__);
+ return -PVFS_EAGAIN;
+ }
}
gossip_debug(GOSSIP_CLIENTCORE_DEBUG,
@@ -3583,7 +3616,7 @@ int main(int argc, char **argv)
}
/* join remount thread; should be long done by now */
- if (remount_complete)
+ if (remount_complete == REMOUNT_COMPLETED)
{
pthread_join(remount_thread, NULL);
}
@@ -3619,6 +3652,12 @@ int main(int argc, char **argv)
return 1;
}
+ /* if the remount failed, tell the parent it was something we did wrong. */
+ if( remount_complete != REMOUNT_COMPLETED )
+ {
+ gossip_err("exec_remount failed\n");
+ return 1;
+ }
/* forward the signal on to the parent */
if(s_client_signal)
{
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers