Jim:

Are you running 2.8.6 on both the servers and the clients, or only on
the head node?

Can you run "ls" on the affected users' directories that appear to be
missing data?  Can you also run pvfs2-ls on those same directories?
Please send me the output from both commands.
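
One way to make the two listings easy to compare is sketched below. This is
only a rough helper, not part of PVFS2: the function name only_in_second and
the paths in the usage comment are hypothetical, and it assumes both tools
print one entry name per line.

```shell
#!/bin/sh
# Print the names that appear in the second listing but not in the first.
# Both inputs are plain text files with one entry name per line.
# Usage: only_in_second FIRST_LISTING SECOND_LISTING
only_in_second() {
    first_sorted=$(mktemp) || return 1
    second_sorted=$(mktemp) || return 1
    sort -u "$1" > "$first_sorted"
    sort -u "$2" > "$second_sorted"
    # comm -13 hides lines unique to the first file and lines common to both,
    # leaving only the lines unique to the second file.
    comm -13 "$first_sorted" "$second_sorted"
    rm -f "$first_sorted" "$second_sorted"
}

# Example usage (hypothetical directory; assumes pvfs2-ls prints one name
# per line, which should be checked on your installation):
#   dir=/mnt/pvfs2/some/user/dir
#   ls -1A "$dir" > /tmp/kernel-view.txt
#   pvfs2-ls "$dir" > /tmp/pvfs2-view.txt
#   only_in_second /tmp/kernel-view.txt /tmp/pvfs2-view.txt
```

Any name printed would be an entry that pvfs2-ls reports but "ls" through the
mounted volume does not; swapping the arguments checks the other direction.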

Thanks,
Becky

On Wed, Aug 1, 2012 at 8:07 PM, Jim Kusznir <[email protected]> wrote:

> So, since switching over to 2.8.6, I've had two users report that
> their larger directories are missing files / data.
>
> Now I'm really in for it... I'm asking for more details, but I'll need
> to address this pretty thoroughly and rapidly.  File systems that
> lose user data are not useful.
>
> --Jim
>
> On Tue, Jul 31, 2012 at 12:36 PM, Becky Ligon <[email protected]> wrote:
> > Jim:
> >
> > The documentation link that I sent doesn't seem to work.  Instead:
> >
> > Go to www.orangefs.org and click on the HTML link for the install guide,
> > about midway down the page.
> >
> > The install guide has a section on setting up a client, and section 3.3
> > describes the pvfs2tab file.
> >
> > Becky
> >
> >
> > On Tue, Jul 31, 2012 at 3:26 PM, Becky Ligon <[email protected]> wrote:
> >>
> >> Jim:
> >>
> >> To generate a new config file, issue the command:
> >>
> >> /opt/pvfs2/bin/pvfs2-genconfig <config file name>
> >>
> >> You will be asked a set of questions regarding your installation.  This
> >> utility may not provide everything you need, depending on your setup.  To
> >> help you, I will forward you a copy of our production conf file.  You can
> >> compare it to your own needs and modify the new conf file as needed.
> >> After you create a new conf file, I would be happy to review it for you.
> >>
> >> I'm not sure how your clients have started without a proper pvfs2tab
> >> file, unless you have the appropriate info in your fstab file.  The mount
> >> info could be in either file.  I will send you a copy of our production
> >> pvfs2tab file as an example.
> >>
> >> The link below will describe how to create the entries in the
> >> pvfs2tab/fstab file.
> >>
> >> http://www.pvfs.org/cvs/pvfs-2-8-branch-docs/doc//pvfs2-quickstart/pvfs2quickstart.php#subsec:client
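
For orientation, the sketch below prints a single entry in the layout used by
the PVFS2 quick start guide's client example, as best recalled here; verify it
against the install guide before relying on it. The helper name
pvfs2tab_entry, the host name, the file system name, and the mount point are
all hypothetical placeholders.

```shell
#!/bin/sh
# Print one pvfs2tab/fstab-style line for a PVFS2 mount over TCP.
# Usage: pvfs2tab_entry SERVER PORT FSNAME MOUNTPOINT
pvfs2tab_entry() {
    printf 'tcp://%s:%s/%s %s pvfs2 defaults,noauto 0 0\n' "$1" "$2" "$3" "$4"
}

# Hypothetical values; substitute your own server, port, file system name,
# and mount point.
pvfs2tab_entry pvfs-server.example.com 3334 pvfs2-fs /mnt/pvfs2
```

The fields follow the usual fstab order: device (here a server URL), mount
point, file system type, options, dump flag, and fsck pass.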
> >>
> >> Thanks for giving 2.8.6 a try!  Let me know how it goes!
> >>
> >> Becky
> >>
> >>
> >>
> >> On Tue, Jul 31, 2012 at 2:14 PM, Jim Kusznir <[email protected]> wrote:
> >>>
> >>> I've got 2.8.6 ready to install, but I've got 15 users on there and a
> >>> full cluster at the moment, so I can't intentionally reboot it.  If it
> >>> crashes on me today, I'll take the opportunity to update everything as
> >>> soon as it comes back and reboot it again.  Otherwise, I'll try early
> >>> tomorrow morning to load and reboot.
> >>>
> >>> Also, you previously mentioned my pvfs2 server configuration file
> >>> format was out of date.  Can you suggest a new config file format to
> >>> use based on what I gave you?  Also, I've never had a pvfs2tab file on
> >>> my clients, and my attempts to create one so far have failed.  It
> >>> seems I don't know the proper syntax, and I haven't found
> >>> sufficiently clear documentation on that either.  It has worked for ~4
> >>> years without one, but...
> >>>
> >>> --Jim
> >>>
> >>> On Tue, Jul 31, 2012 at 10:03 AM, Becky Ligon <[email protected]> wrote:
> >>> > Jim:
> >>> >
> >>> > Next time this happens, can you attach to the pvfs2-client-core process
> >>> > using gdb and see if you can tell in which function it seems to be
> >>> > spinning?  Also, you can try turning on client debugging, so we can see
> >>> > what the client core is doing.  To turn on debugging dynamically, issue
> >>> > the following:
> >>> >
> >>> > echo "all" > /proc/sys/pvfs2/client-debug
> >>> >
> >>> > With the CPU so high, the client-core may or may not see the change in
> >>> > gossip_debug settings.  If it does, then a lot of output will be
> >>> > generated!  Before you reboot your system, make a copy of the client log
> >>> > and send that to me, along with any information you might get from gdb.
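
A non-interactive way to collect the gdb information is sketched below, on the
assumption that gdb is installed on the head node and the script runs as root.
The function name dump_backtraces and the output path in the usage comment are
hypothetical; the gdb options used (-p, -batch, -ex) are standard ones.

```shell
#!/bin/sh
# Write backtraces of every thread in a running process to a file.
# Usage: dump_backtraces PID OUTFILE
dump_backtraces() {
    pid=$1
    out=$2
    # kill -0 sends no signal; it only checks that the PID exists and that
    # we are allowed to signal it.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no such process: $pid" >&2
        return 1
    fi
    # -batch makes gdb exit after running the -ex commands, so it detaches
    # from the process without needing an interactive session.
    gdb -p "$pid" -batch -ex "thread apply all bt" > "$out" 2>&1
}

# Example usage (hypothetical output path):
#   dump_backtraces "$(pidof pvfs2-client-core)" /root/client-core-bt.txt
```

Running it two or three times a few seconds apart and comparing the top frames
is a cheap way to see whether the process keeps returning to the same function.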
> >>> >
> >>> > When you can, please try using 2.8.6 on your head node and see if you
> >>> > can reproduce the problem.
> >>> >
> >>> > Thanks,
> >>> > Becky
> >>> >
> >>> >
> >>> > On Tue, Jul 31, 2012 at 12:45 PM, Jim Kusznir <[email protected]> wrote:
> >>> >>
> >>> >> Unfortunately, the pvfs2-client.log is truncated and reopened on
> >>> >> reboot (i.e., all entries are lost).  Already checked.  Also, I didn't
> >>> >> see anything in /var/log/messages (I looked there when the problem
> >>> >> started mounting).  There appears to be no "paper trail" of this
> >>> >> incident, which is why it's been so hard to track down.
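
Since the log does not survive a reboot, one workaround is to copy it somewhere
persistent while the machine is still responsive, for example periodically from
cron. The sketch below is only an illustration: the function name snapshot_log
and the paths in the usage comment are hypothetical, and the actual location of
pvfs2-client.log depends on how the client was started.

```shell
#!/bin/sh
# Copy a log file into a directory that survives reboots, adding a timestamp
# so snapshots taken at different times do not overwrite each other.
# Prints the path of the new copy on success.
# Usage: snapshot_log LOGFILE DESTDIR
snapshot_log() {
    src=$1
    dest_dir=$2
    [ -f "$src" ] || { echo "missing log: $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    stamp=$(date +%Y%m%d-%H%M%S)
    dest="$dest_dir/$(basename "$src").$stamp"
    # -p preserves the modification time of the original log
    cp -p "$src" "$dest" && echo "$dest"
}

# Example usage (hypothetical paths):
#   snapshot_log /tmp/pvfs2-client.log /var/log/pvfs2-snapshots
```

Old snapshots would need to be pruned by hand or by a separate cleanup job.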
> >>> >>
> >>> >> --Jim
> >>> >>
> >>> >> On Mon, Jul 30, 2012 at 1:18 PM, Becky Ligon <[email protected]> wrote:
> >>> >> > Jim:
> >>> >> >
> >>> >> > Please send the pvfs2-client.log from your head node and the
> >>> >> > /var/log/messages from just before you rebooted.  I'm thinking that
> >>> >> > the high CPU utilization is coming from a failed operation that
> >>> >> > wasn't cleaned up properly.
> >>> >> >
> >>> >> > As I noted in my previous email, 2.8.6 addressed some of these high
> >>> >> > CPU utilization issues.  It would be worthwhile for you to apply 2.8.6
> >>> >> > to your head node and see if this particular situation comes up again.
> >>> >> >
> >>> >> > Becky
> >>> >> >
> >>> >> >
> >>> >> > On Mon, Jul 30, 2012 at 3:03 PM, Jim Kusznir <[email protected]> wrote:
> >>> >> >>
> >>> >> >> I think I caught a pvfs2-induced crash in progress on 2.8.5.  I don't
> >>> >> >> have a crash file, and it looks like it's still in the process of
> >>> >> >> bringing down my head node.  Symptoms were:
> >>> >> >>
> >>> >> >> Someone was doing an scp from (or to, not sure which, but probably
> >>> >> >> from) the pvfs2 volume.  At some point, CPU usage spikes on the head
> >>> >> >> node.  Top shows both the scp and the pvfs2-client-core using 100% of
> >>> >> >> a core.  The load average just keeps going up and up.  At a load
> >>> >> >> average of about 29, I lost responsiveness from the server.  CPU load
> >>> >> >> shows 62.5% iowait, 25% system, 12.5% idle, all others 0.  The only
> >>> >> >> processes of note running are the one scp and the pvfs2 process.
> >>> >> >>
> >>> >> >>
> >>> >> >> My machine has now gone unresponsive; I'll probably need to go hit the
> >>> >> >> front panel reset button.  When it comes back up, I doubt there will
> >>> >> >> be any written logs of what happened.  That's why I can never catch
> >>> >> >> the logs of the crash; it *thinks* it's working until the system goes
> >>> >> >> non-responsive and resets.
> >>> >> >>
> >>> >> >> --Jim



-- 
Becky Ligon
OrangeFS Support and Development
Omnibond Systems
Anderson, South Carolina
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
