On Jan 7, 2010, at 11:32 AM, Michael Moore wrote:

> Continuing to look at the 100% CPU usage (kernel loop) that Randy 
> wrote about previously, I've narrowed the issue down a little. It 
> appears to be related to cancellation of operations when a write() 
> call is blocking and the I/O has been retried. 
> 
> On our cluster the retries were caused by congestion; here I'm 
> re-creating that condition by killing an I/O server. The test C 
> program I'm using (sketched after the backtrace below) just loops 
> around 4k writes to a PVFS file. If I kill a PVFS I/O server while 
> the program is executing, the write hangs (expectedly). About 30% of 
> the time, when I then try to kill the process doing the writing, it 
> spikes to 100% CPU usage and cannot be killed. Also, every time I try 
> to kill the writing process, pvfs2-client-core segfaults with 
> something similar to:
> 
> [E 11:58:09.724121] PVFS2 client: signal 11, faulty address is 0x41ec, 
> from 0x8050b51
> [E 11:58:09.725403] [bt] pvfs2-client-core [0x8050b51]
> [E 11:58:09.725427] [bt] pvfs2-client-core(main+0xe48) [0x8052498]
> [E 11:58:09.725436] [bt] /lib/libc.so.6(__libc_start_main+0xdc) 
> [0x75ee9c]
> [E 11:58:09.725444] [bt] pvfs2-client-core [0x804a381]
> [E 11:58:09.740133] Child process with pid 2555 was killed by an 
> uncaught signal 6
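> 
> For reference, the test program is essentially the following (the 
> mount point is a placeholder, and the error handling is just for 
> illustration):
> 
>     #include <stdio.h>
>     #include <string.h>
>     #include <fcntl.h>
>     #include <unistd.h>
> 
>     int main(void)
>     {
>         char buf[4096];
>         int fd = open("/mnt/pvfs2/testfile", O_WRONLY | O_CREAT, 0644);
> 
>         if (fd < 0)
>         {
>             perror("open");
>             return 1;
>         }
>         memset(buf, 'a', sizeof(buf));
> 
>         /* loop forever writing 4k chunks; the write() hangs once 
>          * an I/O server is killed */
>         while (1)
>         {
>             if (write(fd, buf, sizeof(buf)) < 0)
>             {
>                 perror("write");
>                 break;
>             }
>         }
> 
>         close(fd);
>         return 0;
>     }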
> 
> In the cases where CPU usage hits 100% (and the process can't be 
> terminated), the for() loop in PINT_client_io_cancel strangely 
> segfaults on exactly iteration 31. The value of 
> sm_p->u.io.context_count is in the hundreds, so a significant number 
> of jobs are left to cancel.

Hi Michael,

Are you guys using InfiniBand by chance?  Do you have a stack trace with 
debugging symbols for the point where the pvfs2-client-core segfault occurs?  
That might be useful for narrowing things down.

> 
> The real issue is the 30% of the time when the process gets stuck in 
> the kernel waiting for a downcall. With some additional debugging it 
> is clear that the process's write() call is stuck in the while(1) 
> loop in wait_for_cancellation_downcall(). The function assumes that 
> the request will either time out or be serviced after one iteration 
> of the loop. In this situation, however, neither occurs: the 
> schedule_timeout() call returns immediately because a signal is 
> pending, but the op is never serviced, so the loop spins indefinitely.
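> 
> Schematically, the loop has roughly this shape (paraphrased from 
> memory, with the task-state and wait-queue setup omitted; this is 
> not the exact kernel module source):
> 
>     while (1)
>     {
>         if (op_state_serviced(op))
>         {
>             /* the downcall arrived; normal exit */
>             ret = 0;
>             break;
>         }
> 
>         /* with a signal pending, schedule_timeout() returns 
>          * non-zero immediately, so the timeout branch below is 
>          * never taken; since the op is also never serviced, the 
>          * loop spins forever */
>         if (!schedule_timeout(op_timeout_secs * HZ))
>         {
>             ret = -ETIMEDOUT;
>             break;
>         }
>     }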

Those infinite looping conditionals have bugged me for a while now (there's one 
in wait_for_matching_downcall() too).  We should probably go through and 
systematically replace all of them.  Let's try to fix your bug first, though.

I'm actually surprised that the process goes to 100% CPU in this case.  You're 
right that the op is not getting serviced, so the first if conditional won't 
break out of the while loop.  But schedule_timeout() should only return 
non-zero if the task gets woken up, and the task only gets woken up in 
purge_waiting_ops() when pvfs2-client-core segfaults, which should only happen 
once.  One thing you might try is changing the first if conditional from:

        if (op_state_serviced(op))

to:

        if (op_state_serviced(op) || op_state_purged(op))

That will allow purged ops to exit the while loop.  Could you share your 
debugging output and modified code?
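
In context, the top of the loop would then look something like this (sketch 
only; the surrounding wait-queue and locking code stays as it is):

        while (1)
        {
            /* a serviced op completed normally; a purged op was 
             * orphaned in purge_waiting_ops() when pvfs2-client-core 
             * died and will never be serviced, so in either case 
             * there is no reason to keep waiting */
            if (op_state_serviced(op) || op_state_purged(op))
            {
                break;
            }
            ...
        }

You'd probably also want the caller to check op_state_purged(op) afterwards so 
a purged op is reported as an error rather than a success.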

Thanks,

-sam

> 
> Has anyone else seen the issue with client-core segfaulting on every 
> cancel op? Should the kernel wait_for_cancellation_downcall() be changed 
> to not allow indefinite looping? 
> 
> Thanks,
> Michael


_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
