Jim:

Can you send me the kmod-pvfs2-...rpm?  I'd like to see how its files are
laid out.

Thanks,
Becky

On Sat, Jul 21, 2012 at 4:46 PM, Jim Kusznir <[email protected]> wrote:

> Hi Becky:
>
> Thanks for all your input.  I was on travel and am currently catching
> up on e-mail, so here are answers to your questions:
>
> 1) this problem occurs on both my ROCKS 5.1 (CentOS 5.2) and ROCKS 6
> (CentOS 6.2) clusters identically.
> 2) I can mount manually using the init script.  It just will not run
> on boot.  It tries, but fails with the error message supplied.
> 3) The module is installed with a kmod-pvfs2-... rpm (as is required
> for ROCKS clusters...Any software to be installed on each node needs
> to be its own RPM).  It appears to me that the module is being loaded
> successfully.
> 4) Ok, that sounds plausible.  I'll make those corrections and see if
> that fixes things.
>
> Of course, the mount on boot was one of two show-stopping issues.  The
> second show-stopping issue is how many kernel panics are being caused
> by OrangeFS.  I've been experiencing 3-8 KP's a week on a light to
> moderate load on my cluster (24 nodes + head node, 3 pvfs nodes).
>
> My versions in use are: 2.8.5 (ROCKS 5.1), 2.8.6 (ROCKS 6).  For my
> users, I absolutely must have a "traditional filesystem interface"
> (eg, MPI-IO or pvfs-* commands are not acceptable, they need to work
> on the files like they would for any other filesystem).
>
> --Jim
>
> On Fri, Jul 20, 2012 at 1:45 PM, Becky Ligon <[email protected]> wrote:
> > Jim:
> >
> > In your init script, you need to add the LD_LIBRARY_PATH variable, since
> > your pvfs library is not in a standard location:
> >
> > export LD_LIBRARY_PATH=/opt/pvfs2/lib:$LD_LIBRARY_PATH
> >
> > Remove the LD_PRELOAD.  It is not needed here.
> >
> > Before "modprobe" will work, you have to run the command "depmod" to update
> > the modules list.  The "make kmod_install" does not automatically do this.
> > NOTE:  if you place the kernel module (pvfs2.ko) somewhere other than
> > /lib/modules/`uname -r`/kernel/fs/pvfs2, then you can't use modprobe to load
> > the module.  Instead, use "/sbin/insmod <path>/pvfs2.ko".  If you are using
> > the rpm spec that I gave you (and it looks like you are), then pvfs2.ko is
> > located in /opt/pvfs2/lib/pvfs2.ko, in which case, you have to use the
> > "insmod" command to load it and the "rmmod" command to unload it.
> >
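As a rough illustration of the modprobe-versus-insmod rule above, here is a
minimal sketch.  The helper name choose_loader is hypothetical, and it assumes
that any path under /lib/modules/<kernel release>/ can be resolved by modprobe
once depmod has run.  It only prints the command; it does not load anything.

```shell
#!/bin/sh
# Sketch only: choose_loader is a hypothetical helper, not part of OrangeFS.
# Prints the command that would load the module at $1 on kernel release $2.
choose_loader() {
    ko_path=$1
    kver=$2
    case "$ko_path" in
        /lib/modules/"$kver"/*)
            # Inside the kernel's module tree: depmod indexes it, then
            # modprobe can resolve the module by name.
            echo "/sbin/depmod -a && /sbin/modprobe pvfs2"
            ;;
        *)
            # Anywhere else (for example /opt/pvfs2/lib): load by explicit path.
            echo "/sbin/insmod $ko_path"
            ;;
    esac
}

choose_loader /opt/pvfs2/lib/pvfs2.ko "$(uname -r)"
```

With the rpm layout described above, this prints
"/sbin/insmod /opt/pvfs2/lib/pvfs2.ko".
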
> > When you issue a "stop", your script does not stop the client nor does it
> > unload the kernel module.  This will cause problems if you issue a "start"
> > by starting another pvfs2-client.  I will send you the init script that we
> > use here.  Maybe, you can modify it to accommodate your environment.  We
> > have more checks in it than you have in yours.
> >
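A minimal sketch of the teardown order a fuller "stop" branch might follow,
based on the points above.  This is not the init script we use here: the
MOUNT_POINT variable, the function name stop_steps, and the use of killall to
stop the client are assumptions for illustration.  The function only prints the
commands in order so the sequence can be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch only: prints the teardown commands in order instead of running them.
# MOUNT_POINT, stop_steps and the use of killall are illustrative assumptions.
MOUNT_POINT=/mnt/pvfs2

stop_steps() {
    # 1. Unmount first so nothing is still using the filesystem.
    echo "umount $MOUNT_POINT"
    # 2. Stop the client daemon so a later "start" does not launch a second one.
    echo "killall pvfs2-client"
    # 3. Unload the kernel module (rmmod pairs with insmod).
    echo "/sbin/rmmod pvfs2"
    # 4. Clear the subsys lock file that the "start" branch created.
    echo "rm -f /var/lock/subsys/pvfs2-client"
}

stop_steps
```

Running the steps in the reverse order of "start" leaves no stale client or
module behind for the next "start" to collide with.
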
> > I am not familiar with how PVFS reacts to the "intr" option that you specify
> > in the mount command.  What is its purpose?
> >
> > Becky
> >
> >
> > On Fri, Jul 20, 2012 at 3:27 PM, Becky Ligon <[email protected]> wrote:
> >>
> >> Jim:
> >>
> >> I just realized that you have already sent me your init script.  Let me
> >> take a closer look at it.
> >>
> >> Becky
> >>
> >>
> >> On Fri, Jul 20, 2012 at 3:13 PM, Becky Ligon <[email protected]> wrote:
> >>>
> >>> Jim:
> >>>
> >>> I have successfully booted my CentOS 6.2 system (using
> >>> 2.6.32-220.13.1.el6.x86_64) and started the PVFS2 server and mounted the
> >>> client.  Thus, I can only guess that there is something in your environment
> >>> causing the problem.  Is it possible for you to mount the client by issuing
> >>> the commands manually once the system is running?  Can you send me a copy of
> >>> your startup script for mounting the client from your /etc/init.d directory?
> >>>
> >>> Becky
> >>>
> >>>
> >>> On Thu, Jul 19, 2012 at 12:58 PM, Becky Ligon <[email protected]> wrote:
> >>>>
> >>>> Jim:
> >>>>
> >>>> I have been able to successfully mount-on-boot on a VM with the
> >>>> 2.6.32-220.13.1.el6.x86_64.  However, I was using the Scientific Linux 6
> >>>> distro and NOT CentOS 6.2.  Next, I will try a CentOS 6.2 distro and see
> >>>> what happens with it.
> >>>>
> >>>> Becky
> >>>>
> >>>>
> >>>> On Wed, Jul 18, 2012 at 5:14 PM, Becky Ligon <[email protected]> wrote:
> >>>>>
> >>>>> Jim:
> >>>>>
> >>>>> Is the mount-on-boot issue just with your CentOS 6.2 environment?  If
> >>>>> so, which version of OrangeFS are you running?
> >>>>>
> >>>>> Becky
> >>>>>
> >>>>>
> >>>>> On Wed, Jul 18, 2012 at 3:28 PM, Jim Kusznir <[email protected]>
> >>>>> wrote:
> >>>>>>
> >>>>>> I cannot reproduce the pvfs2 crash on demand.  I have not yet seen it
> >>>>>> on centos 6, but I haven't placed centos6 into production yet.
> >>>>>>
> >>>>>> On my centos5 systems, it's not reproducible on demand, but it seems to
> >>>>>> happen with moderate file access from a few different processes.
> >>>>>> Sometimes scp'ing files to/from pvfs2 on the head node (which is a
> >>>>>> pvfs2 client) will do it.  This has happened since the beginning of
> >>>>>> pvfs2 for me; on the compute nodes, I'm not sure if there's more than
> >>>>>> one process, but since I updated to OrangeFS 2.8.5, I've been seeing
> >>>>>> compute nodes KP with the previous screenshot (it did not crash (that
> >>>>>> I'm aware of) prior to OrangeFS 2.8.5 on compute nodes).
> >>>>>>
> >>>>>> Here's my /etc/init.d/pvfs2-client script:
> >>>>>> ---------------
> >>>>>> #!/bin/sh
> >>>>>> #
> >>>>>> # chkconfig: 2345 99 99
> >>>>>> #
> >>>>>> # description: mount pvfs2 filesystem
> >>>>>> #
> >>>>>>
> >>>>>> . /etc/rc.d/init.d/functions
> >>>>>> #export LD_PRELOAD=/opt/db4/lib/
> >>>>>> case "$1" in
> >>>>>> start)
> >>>>>>         echo -n "Mounting PVFS2 Filesystem: "
> >>>>>>         modprobe pvfs2
> >>>>>>         /opt/pvfs2/sbin/pvfs2-client -p /opt/pvfs2/sbin/pvfs2-client-core
> >>>>>>         mkdir -p /mnt/pvfs2
> >>>>>>         mount -t pvfs2 -o intr tcp://pvfs2-io-0-0:3334/pvfs2-fs /mnt/pvfs2
> >>>>>>         touch /var/lock/subsys/pvfs2-client
> >>>>>>         ;;
> >>>>>>
> >>>>>> stop)
> >>>>>>         echo -n "Unmounting PVFS2 Filesystem: "
> >>>>>>         umount /mnt/pvfs2
> >>>>>>         rm -f /var/lock/subsys/pvfs2-client
> >>>>>>         ;;
> >>>>>>
> >>>>>> restart)
> >>>>>>         $0 stop
> >>>>>>         $0 start
> >>>>>>         ;;
> >>>>>>
> >>>>>> status)
> >>>>>>         status $NAME
> >>>>>>         ;;
> >>>>>> *)
> >>>>>>         echo "Usage: $NAME {start|stop|restart|status}"
> >>>>>>         exit 1
> >>>>>> esac
> >>>>>>
> >>>>>> exit 0
> >>>>>> ----------------
> >>>>>> I've tried with the export commented and uncommented, no difference.
> >>>>>>
> >>>>>> --Jim
> >>>>>>
> >>>>>> On Wed, Jul 18, 2012 at 12:20 PM, Becky Ligon <[email protected]>
> >>>>>> wrote:
> >>>>>> > Thanks, Jim.
> >>>>>> >
> >>>>>> > We are using 2.6.32-220.4.1.el6.x86_64 in our production environment.
> >>>>>> > So, I should be able to set up a VM with your kernel version and test.
> >>>>>> > Can you give me a scenario to try in order to reproduce the problem?
> >>>>>> >
> >>>>>> > I am also setting up a CentOS 6 VM, so I can analyze the mount-on-boot
> >>>>>> > issue.
> >>>>>> >
> >>>>>> > Becky
> >>>>>> >
> >>>>>> >
> >>>>>> > On Wed, Jul 18, 2012 at 3:16 PM, Jim Kusznir <[email protected]>
> >>>>>> > wrote:
> >>>>>> >>
> >>>>>> >> [root@aeoltest torque]# rpm -qa |grep kernel
> >>>>>> >> kernel-2.6.32-220.13.1.el6.x86_64
> >>>>>> >> dracut-kernel-004-256.el6_2.1.noarch
> >>>>>> >> kernel-devel-2.6.32-220.13.1.el6.x86_64
> >>>>>> >> kernel-headers-2.6.32-220.13.1.el6.x86_64
> >>>>>> >> kernel-firmware-2.6.32-220.13.1.el6.noarch
> >>>>>> >> kernel-doc-2.6.32-220.13.1.el6.noarch
> >>>>>> >> [root@aeoltest torque]# uname -a
> >>>>>> >> Linux aeoltest.local 2.6.32-220.13.1.el6.x86_64 #1 SMP Tue Apr 17
> >>>>>> >> 23:56:34 BST 2012 x86_64 x86_64 x86_64 GNU/Linux
> >>>>>> >> [root@aeoltest torque]#
> >>>>>> >>
> >>>>>> >>
> >>>>>> >> On Wed, Jul 18, 2012 at 12:10 PM, Becky Ligon <[email protected]>
> >>>>>> >> wrote:
> >>>>>> >> > Jim:
> >>>>>> >> >
> >>>>>> >> > We are working on a few corrections to the user library, as we speak,
> >>>>>> >> > that were identified last week.  Using LD_PRELOAD would definitely get
> >>>>>> >> > around the kernel issues at hand, but I ask that you wait until we have
> >>>>>> >> > all of the current corrections in place before using it.
> >>>>>> >> >
> >>>>>> >> > I also have some questions for you.  I am working on the
> >>>>>> >> > "won't mount on boot" issue and would like to know the specific kernel
> >>>>>> >> > that you are using under CentOS 6.2.
> >>>>>> >> >
> >>>>>> >> > Thanks,
> >>>>>> >> > Becky
> >>>>>> >> >
> >>>>>> >> >
> >>>>>> >> > On Wed, Jul 18, 2012 at 3:01 PM, Jim Kusznir <[email protected]>
> >>>>>> >> > wrote:
> >>>>>> >> >>
> >>>>>> >> >> I managed to get a screenshot from an IP-KVM with the last chunk of a
> >>>>>> >> >> pvfs-induced KP on a compute node; image attached.
> >>>>>> >> >>
> >>>>>> >> >> With respect to client access methods, perhaps I should switch to a
> >>>>>> >> >> user space solution.  I remember hearing about an LD_PRELOAD client
> >>>>>> >> >> module (not using fuse, but being entirely userspace).  Is that
> >>>>>> >> >> "ready" with 2.8.6?  If not, perhaps I need to switch to the fuse
> >>>>>> >> >> module...
> >>>>>> >> >>
> >>>>>> >> >> --Jim
> >>>>>> >> >>
> >>>>>> >> >> On Wed, Jul 18, 2012 at 11:46 AM, Andrew Savchenko
> >>>>>> >> >> <[email protected]>
> >>>>>> >> >> wrote:
> >>>>>> >> >> > Hello Becky,
> >>>>>> >> >> >
> >>>>>> >> >> > On Wed, 18 Jul 2012 12:43:51 -0400 Becky Ligon wrote:
> >>>>>> >> >> >> Andrew:
> >>>>>> >> >> >>
> >>>>>> >> >> >> 2.8.6 does not fix the problem you were seeing with question marks
> >>>>>> >> >> >> in the "ls" output, but we are working on it.
> >>>>>> >> >> >>
> >>>>>> >> >> >> Just FYI!
> >>>>>> >> >> >
> >>>>>> >> >> > Thanks for the warning. I'll keep sticking to the fuse client during
> >>>>>> >> >> > update then.
> >>>>>> >> >> >
> >>>>>> >> >> > Best regards,
> >>>>>> >> >> > Andrew Savchenko
> >>>>>> >> >> >
> >>>>>> >> >> > _______________________________________________
> >>>>>> >> >> > Pvfs2-users mailing list
> >>>>>> >> >> > [email protected]
> >>>>>> >> >> > http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
> >>>>>> >> >> >
> >>>>>> >> >
> >>>>>> >> >
> >>>>>> >> >
> >>>>>> >> >
> >>>>>> >> > --
> >>>>>> >> > Becky Ligon
> >>>>>> >> > OrangeFS Support and Development
> >>>>>> >> > Omnibond Systems
> >>>>>> >> > Anderson, South Carolina
> >>>>>> >> >
> >>>>>> >> >
>



-- 
Becky Ligon
OrangeFS Support and Development
Omnibond Systems
Anderson, South Carolina
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
