Attached is the cvs diff with the requested flags. I noticed how useless
the previous patch format I used was when I was applying the cancel I/O
patch :)
It does lead to a pvfs2-client-core restart loop if the connection to
the server never comes back. However, the loop is tempered by the BMI
timeout and retry counts, so each cycle should take a reasonably long
time (I don't recall the defaults offhand, but it should only restart
every couple of minutes).
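To illustrate the throttling effect, here is a standalone sketch (not
PVFS code; the supervisor structure, names, and retry/timeout values
below are all made up) of a parent that restarts its child on a
non-zero exit, where each child incarnation burns through its own
timeout/retry budget before giving up:

    /*
     * Hypothetical sketch: a parent restarts its child whenever it
     * exits non-zero, but each child incarnation blocks for up to
     * RETRY_COUNT * TIMEOUT_SECS in its own retry loop before giving
     * up, so restarts only happen every few minutes.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    #define RETRY_COUNT  5   /* stand-in for the BMI retry count   */
    #define TIMEOUT_SECS 30  /* stand-in for the BMI timeout value */

    /* pretend "remount": fails after exhausting its timed retries */
    static int try_remount(void)
    {
        int i;
        for (i = 0; i < RETRY_COUNT; i++)
        {
            sleep(TIMEOUT_SECS);  /* each attempt blocks on a timeout */
            fprintf(stderr, "remount attempt %d failed\n", i + 1);
        }
        return -1;
    }

    int main(void)
    {
        for (;;)
        {
            pid_t pid;
            int status = 0;

            pid = fork();
            if (pid == 0)
            {
                /* child plays the role of pvfs2-client-core */
                exit(try_remount() ? 1 : 0);
            }

            waitpid(pid, &status, 0);
            if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            {
                break;  /* the "remount" finally succeeded */
            }
            /* otherwise loop and restart; the cycle time is bounded
             * below by RETRY_COUNT * TIMEOUT_SECS inside the child */
        }
        return 0;
    }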
Michael
On Mon, Feb 08, 2010 at 01:51:16PM -0500, Phil Carns wrote:
> Hi Michael,
>
> Could you regenerate this patch with "diff -Naupr" (or "cvs diff
> -Naup")? The -u in particular makes it a little easier to read/apply.
>
> I think this is the same issue as described in this open trac entry,
> which would be great to knock out:
>
> https://trac.mcs.anl.gov/projects/pvfs/ticket/66
>
> I haven't traced through the code yet to look myself, but is there any
> chance of the pvfs2-client-core getting stuck in a restart loop?
>
> -Phil
>
> Michael Moore wrote:
> > Attached is a patch against head for the issue. The comments largely
> > describe what's going on. If pvfs2-client-core is restarted due to a
> > segfault while a PVFS filesystem was previously mounted, any requests
> > will cause the process to spin.
> >
> > The patch adds a check at the end of the process_vfs_requests
> > while(s_client_is_processing) loop to see whether remount_complete is
> > set to failed. If so, it exits pvfs2-client-core with a non-zero value
> > so a new client-core will get restarted and mount/add the filesystem
> > once exec_remount completes successfully. If everything looks okay,
> > can you apply it to head?
> >
> > Thanks,
> > Michael
> >
> > On Mon, Feb 01, 2010 at 02:09:46PM -0500, Michael Moore wrote:
> >> On Mon, Feb 01, 2010 at 02:04:22PM -0500, Michael Moore wrote:
> >>> We recently saw some strange behavior in pvfs2-client-core when a
> >>> server goes away (via segfault) and the client is unable to re-mount
> >>> the filesystem. The pvfs2-client-core process takes up 100% of a core
> >>> just spinning on process_vfs_requests -> PVFS_sys_testsome and
> >>> subsequent calls. Full backtrace follows.
> >>>
> >>> In looking at the code in pvfs2-client-core, it seems to assume that
> >>> the re-mount will always succeed (around line 3579). However, I don't
> >>> know that it's the root cause of the issue. I'll continue looking but
> >>> wondered if anyone had ideas on this.
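For reference, the pre-patch logic in exec_remount() (condensed here
from the lines the diff at the end of this message removes) marks the
remount complete even when PINT_dev_remount() fails:

    /* pre-patch exec_remount() behavior: the completion flag is set
     * regardless of whether the remount actually worked */
    if (PINT_dev_remount())
    {
        gossip_err("*** Failed to remount filesystems!\n");
    }

    remount_complete = 1;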
> >>> This appears to be re-creatable by:
> >>> 1) cleanly mounting and using the filesystem for some I/O,
> >>> 2) either killing the servers or adding iptables rules on the client
> >>>    to reject traffic to the server, and
> >>> 3) attempting I/O from the client.
> >> I neglected to mention that the pvfs2-client-core must be killed after
> >> attempting I/O traffic to the 'failed' server, as I only saw this
> >> behavior after the client-core restarts. I'm still digging into the
> >> reason the client-core segfaulted after a failed I/O flow.
> >>
> >> Michael
> >>
> >>> The operation correctly dies with connection refused, but the client
> >>> begins to spin, taking up CPU.
> >>>
> >>> (gdb) bt
> >>> #0 0x00511402 in __kernel_vsyscall ()
> >>> #1 0x001d8023 in poll () from /lib/libc.so.6
> >>> #2 0x0082ebed in PINT_dev_test_unexpected (incount=5,
> >>> outcount=0xbf84e4f8, info_array=0x8bbbc0, max_idle_time=10) at
> >>> src/io/dev/pint-dev.c:398
> >>> #3 0x00848f50 in PINT_thread_mgr_dev_push (max_idle_time=10) at
> >>> src/io/job/thread-mgr.c:332
> >>> #4 0x00844caf in do_one_work_cycle_all (idle_time_ms=10) at
> >>> src/io/job/job.c:5238
> >>> #5 0x008454a1 in job_testcontext (out_id_array_p=0xbf8515f0,
> >>> inout_count_p=0xbf8521f4, returned_user_ptr_array=0xbf851df4,
> >>> out_status_array_p=0xbf84e5f0, timeout_ms=10, context_id=0) at
> >>> src/io/job/job.c:4273
> >>> #6 0x00857dba in PINT_client_state_machine_testsome
> >>> (op_id_array=0xbf8522a8, op_count=0xbf8528c4, user_ptr_array=0xbf8527a8,
> >>> error_code_array=0xbf8526a8, timeout_ms=10) at
> >>> src/client/sysint/client-state-machine.c:756
> >>> #7 0x00857fb9 in PVFS_sys_testsome (op_id_array=0xbf8522a8,
> >>> op_count=0xbf8528c4, user_ptr_array=0xbf8527a8,
> >>> error_code_array=0xbf8526a8, timeout_ms=10) at
> >>> src/client/sysint/client-state-machine.c:971
> >>> #8 0x08050cd6 in process_vfs_requests () at
> >>> src/apps/kernel/linux/pvfs2-client-core.c:3119
> >>> #9 0x08052658 in main (argc=10, argv=0xbf852a74) at
> >>> src/apps/kernel/linux/pvfs2-client-core.c:3579
> >>>
> >>> Let me know if more information is needed; thanks for the input!
> >>>
> >>> Michael
>
Index: src/apps/kernel/linux/pvfs2-client-core.c
===================================================================
RCS file: /projects/cvsroot/pvfs2/src/apps/kernel/linux/pvfs2-client-core.c,v
retrieving revision 1.107
diff -a -u -p -r1.107 pvfs2-client-core.c
--- src/apps/kernel/linux/pvfs2-client-core.c 29 Jan 2010 21:41:34 -0000 1.107
+++ src/apps/kernel/linux/pvfs2-client-core.c 8 Feb 2010 19:06:45 -0000
@@ -130,9 +130,13 @@ typedef struct
be serviced by our regular handlers. to do both, we use a thread
for the blocking ioctl.
*/
+#define REMOUNT_NOTCOMPLETED 0
+#define REMOUNT_COMPLETED 1
+#define REMOUNT_FAILED 2
static pthread_t remount_thread;
static pthread_mutex_t remount_mutex = PTHREAD_MUTEX_INITIALIZER;
-static int remount_complete = 0;
+static int remount_complete = REMOUNT_NOTCOMPLETED;
+
/* used for generating unique dynamic mount point names */
static int dynamic_mount_id = 1;
@@ -502,12 +506,17 @@ static void *exec_remount(void *ptr)
will fill in our dynamic mount information by triggering mount
upcalls for each fs mounted by the kernel at this point
*/
+
+ /* if PINT_dev_remount fails, set remount_complete appropriately */
if (PINT_dev_remount())
{
gossip_err("*** Failed to remount filesystems!\n");
+ remount_complete = REMOUNT_FAILED;
+ }
+ else
+ {
+ remount_complete = REMOUNT_COMPLETED;
}
-
- remount_complete = 1;
pthread_mutex_unlock(&remount_mutex);
return NULL;
@@ -2842,7 +2851,7 @@ static inline PVFS_error handle_unexp_vf
goto repost_op;
}
- if (!remount_complete &&
+ if (remount_complete == REMOUNT_NOTCOMPLETED &&
(vfs_request->in_upcall.type != PVFS2_VFS_OP_FS_MOUNT))
{
gossip_debug(
@@ -3123,6 +3132,7 @@ static PVFS_error process_vfs_requests(v
for(i = 0; i < op_count; i++)
{
vfs_request = vfs_request_array[i];
+
assert(vfs_request);
/* assert(vfs_request->op_id == op_id_array[i]); */
if (vfs_request->num_ops == 1 &&
@@ -3269,6 +3279,29 @@ static PVFS_error process_vfs_requests(v
vfs_request, "normal_completion");
assert(ret == 0);
}
+
+ /* The status of the remount thread needs to be checked in the event
+ * the remount fails on client-core startup. If this is the initial
+ * startup then any mount requests will fail as expected and the
+ * client-core will behave normally. However, if a mount was
+ * previously successful (in a previous client-core incarnation), the
+ * client-core doesn't check whether the remount succeeded before
+ * handling the mount request and fs_add. Any subsequent requests then
+ * cause this thread to spin around PINT_dev_test_unexpected.
+ *
+ * Given the current structure of process_vfs_requests (the remount
+ * thread is created before entering the while loop), exiting the
+ * client-core on a failed remount attempt is the most straightforward
+ * way to handle this case. Exiting causes the parent to kick off
+ * another client-core, which retries the remount until it succeeds.
+ */
+ if( remount_complete == REMOUNT_FAILED )
+ {
+ gossip_debug(GOSSIP_CLIENTCORE_DEBUG,
+ "%s: remount not completed successfully, no longer "
+ "handling requests.\n", __func__);
+ return -PVFS_EAGAIN;
+ }
}
gossip_debug(GOSSIP_CLIENTCORE_DEBUG,
@@ -3583,7 +3616,7 @@ int main(int argc, char **argv)
}
/* join remount thread; should be long done by now */
- if (remount_complete)
+ if (remount_complete == REMOUNT_COMPLETED)
{
pthread_join(remount_thread, NULL);
}
@@ -3619,6 +3652,12 @@ int main(int argc, char **argv)
return 1;
}
+ /* if the remount failed, tell the parent it was something we did wrong. */
+ if( remount_complete != REMOUNT_COMPLETED )
+ {
+ gossip_err("exec_remount failed\n");
+ return 1;
+ }
/* forward the signal on to the parent */
if(s_client_signal)
{
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers