Hi Mi,

I don't think there have been any applicable commits since 06/28 to
Orange-Branch that would address this issue. Is the panic consistently
reproducible? If so, what workload leads to the panic? Single client with
writes to a single file? I'll look at the logs to see if anything stands
out; otherwise I may need to reproduce the issue locally to track down
what's going on.

Thanks for reporting the issue,
Michael

On Tue, Jul 12, 2011 at 5:43 PM, Mi Zhou <[email protected]> wrote:

> Hi,
>
> I checked out the code from the CVS branch on 6/28. I no longer see an
> immediate kernel panic, but I still get a kernel panic after some
> intensive writes to the file system (please see the attached screenshot).
>
> This is what is in the server log:
> pvfs02: [E 07/12/2011 16:05:30] Error: encourage_recv_incoming: mop_id 17bb01c0 in RTS_DONE message not found.
> pvfs02: [E 07/12/2011 16:05:30]         [bt] /opt/pvfs/pvfs/sbin/pvfs2-server(error+0xca) [0x449d3a]
> pvfs02: [E 07/12/2011 16:05:30]         [bt] /opt/pvfs/pvfs/sbin/pvfs2-server [0x446f30]
> pvfs02: [E 07/12/2011 16:05:30]         [bt] /opt/pvfs/pvfs/sbin/pvfs2-server [0x448a79]
> pvfs02: [E 07/12/2011 16:05:30]         [bt] /opt/pvfs/pvfs/sbin/pvfs2-server(BMI_testunexpected+0x383) [0x445263]
> pvfs02: [E 07/12/2011 16:05:30]         [bt] /opt/pvfs/pvfs/sbin/pvfs2-server [0x47b27c]
> pvfs02: [E 07/12/2011 16:05:30]         [bt] /lib64/libpthread.so.0 [0x3ba820673d]
> pvfs02: [E 07/12/2011 16:05:30]         [bt] /lib64/libc.so.6(clone+0x6d) [0x3ba7ad44bd]
>
>
>
> And this is the client log:
>
> [E 16:10:17.359727] job_time_mgr_expire: job time out: cancelling bmi operation, job_id: 173191.
> [E 16:10:19.371926] Warning: ib_tcp_client_connect: connect to server pvfs02:3337: Connection refused.
> [E 16:10:19.371943] Receive immediately failed: Connection refused
> [E 16:10:21.382875] Warning: ib_tcp_client_connect: connect to server pvfs02:3337: Connection refused.
> [E 16:10:21.382888] Receive immediately failed: Connection refused
>
> We have pvfs01-03 running the pvfs-server. Both client and server are
> on CentOS 5 x86_64, kernel version 2.6.18-238.9.1.el5.
>
> Any advice?
>
> Thanks,
>
> Mi
>
>
> On Thu, 2011-07-07 at 12:33 -0500, Ted Hesselroth wrote:
> > That did resolve the problem. Thanks.
> >
> > On 7/7/2011 11:19 AM, Michael Moore wrote:
> > > Hi Ted,
> > >
> > > > There was a regression when adding support for newer kernels that made
> > > > it in to the 2.8.4 release. I believe that's the issue you're seeing (a
> > > > kernel panic immediately on modprobe/insmod). The next release will
> > > > include that fix. Until then, if you can check out the latest version of
> > > > the code from CVS, it should resolve the issue. The CVS branch is
> > > > Orange-Branch, full directions for CVS checkout at
> > > > http://www.orangefs.org/support/
> > > >
> > > > We are currently running the kernel module with the latest code on
> > > > CentOS 5 and SL 6 systems. Let me know how it goes.
> > > >
> > > > For anyone interested, the commit to resolve the issue was:
> > > >
> > > > http://www.pvfs.org/fisheye/changelog/~br=Orange-Branch/PVFS/?cs=Orange-Branch:mtmoore:20110530154853
> > >
> > > Michael
> > >
> > >
> > > On Thu, Jul 7, 2011 at 11:36 AM, Ted Hesselroth<[email protected]>  wrote:
> > >
> > > >> I have built the kernel module from orangefs-2.8.4 source against a
> > > >> 64-bit 2.6.18-238.12.1 linux kernel source, and against a 32-bit
> > > >> 2.6.18-238.9.1 source. In both cases, the kernel hung when the module
> > > >> was inserted with insmod. The first did report
> > > >> "kernel: Oops: 0000 [1] SMP". The distributions are Scientific Linux
> > > >> 5.x, which is rpm-based and similar to Centos.
> > > >>
> > > >> Are there kernels for this scenario for which the build is known to
> > > >> work? The server build and install went fine, but I would like to
> > > >> configure some clients to access orangefs through a mount point.
> > >>
> > >> Thanks.
> > >>
> > > >> _______________________________________________
> > > >> Pvfs2-users mailing list
> > > >> Pvfs2-users@beowulf-underground.org
> > > >> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
> > >>
> > >
> >
> --
>
> Mi Zhou
> System Integration Engineer
> Information Sciences
> St. Jude Children's Research Hospital
> 262 Danny Thomas Pl. MS 312
> Memphis, TN 38105
> 901.595.5771
>
> Email Disclaimer:  www.stjude.org/emaildisclaimer
>
