Jim:

Next time this happens, can you attach to the pvfs2-client-core process
using gdb and see if you can tell in which function it seems to be spinning?
Also, you can try turning on client debugging, so we can see what the
client core is doing.  To turn on debugging dynamically, issue the
following:

echo "all" > /proc/sys/pvfs2/client-debug

With the CPU so high, the client-core may or may not see the change in
gossip_debug settings.  If it does, then a lot of output will be
generated!  Before you reboot your system, make a copy of the client log
and send that to me, along with any information you might get from gdb.
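
For reference, here is a minimal sketch of those steps as a shell script.
The function names, the log path /tmp/pvfs2-client.log, and the output
locations are assumptions for illustration only; adjust them to your
setup.  It assumes gdb and pidof are installed and that it is run as root.

```shell
#!/bin/sh
# Hypothetical helpers; all names and paths here are illustrative assumptions.

# Assumed log location; change it if pvfs2-client writes its log elsewhere.
CLIENT_LOG="${CLIENT_LOG:-/tmp/pvfs2-client.log}"

# Append several backtraces of pvfs2-client-core, taken a few seconds
# apart, to a file.  A function that appears in every sample is the
# likely place where the process is spinning.
#   $1 = output file, $2 = number of samples (default 3)
sample_backtraces() {
    bt_out=$1
    bt_count=${2:-3}
    # -s asks pidof for a single PID.
    bt_pid=$(pidof -s pvfs2-client-core) || return 1
    i=0
    while [ "$i" -lt "$bt_count" ]; do
        # -batch makes gdb run the given commands, detach, and exit.
        gdb -p "$bt_pid" -batch -ex "thread apply all bt" >> "$bt_out" 2>&1
        sleep 2
        i=$((i + 1))
    done
}

# Copy the client log to a timestamped file and flush it to disk, so a
# copy exists before the machine is reset.  Prints the new file name.
#   $1 = log file, $2 = destination directory
save_client_log() {
    log_src=$1
    log_dir=$2
    log_copy="$log_dir/$(basename "$log_src").$(date +%Y%m%d-%H%M%S)"
    cp -p "$log_src" "$log_copy" && sync && printf '%s\n' "$log_copy"
}
```

After turning on debugging with the echo command above, something like
the following would collect both pieces, writing to a directory on local
disk rather than on the PVFS2 volume:

    sample_backtraces /root/client-core-bt.txt
    save_client_log "$CLIENT_LOG" /root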

When you can, please try using 2.8.6 on your head node and see if you can
reproduce the problem.

Thanks,
Becky

On Tue, Jul 31, 2012 at 12:45 PM, Jim Kusznir <[email protected]> wrote:

> Unfortunately, the pvfs2-client.log is truncated and reopened on
> reboot (i.e., all entries are lost).  Already checked.  Also, I didn't
> see anything in /var/log/messages (I looked there when the problem
> started mounting).  There appears to be no "paper trail" of this
> incident, which is why it's been so hard to track down.
>
> --Jim
>
> On Mon, Jul 30, 2012 at 1:18 PM, Becky Ligon <[email protected]> wrote:
> > Jim:
> >
> > Please send the pvfs2-client.log from your head node and the
> > /var/log/messages just before you rebooted.  I'm thinking that the high
> > CPU utilization is coming from a failed operation that wasn't cleaned up
> > properly.
> >
> > As I noted in my previous email, 2.8.6 addressed some of these high CPU
> > utilization issues.  It would be worthwhile for you to apply 2.8.6 to
> > your head node and see if this particular situation comes up again.
> >
> > Becky
> >
> >
> > On Mon, Jul 30, 2012 at 3:03 PM, Jim Kusznir <[email protected]> wrote:
> >>
> >> I think I caught a pvfs2-induced crash in progress on 2.8.5.  I don't
> >> have a crash file, and it looks like it's still in the process of
> >> bringing down my head node.  Symptoms were:
> >>
> >> Someone was doing an scp from (or to, not sure which, but probably
> >> from) the pvfs2 volume.  At some point, CPU usage spikes on the head
> >> node.  Top shows both the scp and the pvfs2-client-core using 100% of
> >> a core.  The load avg just keeps going up and up.  At a load of about
> >> 29, I lost responsiveness from the server.  CPU load shows 62.5%
> >> iowait, 25% system, 12.5% idle, all others 0.  The only processes of
> >> note running are the one scp and the pvfs2 process.
> >>
> >>
> >> My machine has now gone unresponsive; I'll probably need to go hit the
> >> front panel reset button.  When it comes back up, I doubt there will
> >> be any written logs of what happened.  That is why I can never catch
> >> the logs of the crash; it *thinks* it's working until the system goes
> >> non-responsive and resets.
> >>
> >> --Jim
> >
> >
> >
> >
> > --
> > Becky Ligon
> > OrangeFS Support and Development
> > Omnibond Systems
> > Anderson, South Carolina
> >
> >
>



-- 
Becky Ligon
OrangeFS Support and Development
Omnibond Systems
Anderson, South Carolina
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users