On Thu, Feb 14, 2008 at 04:33:29PM -0500, Jon M Burgoyne wrote:
> Wondering if anyone has run across this yet (I'm new to the list).  Am running
> pvfs2 server v 2.7.0 on 3 meta/io servers which are RHEL5
> 2.6.18-53.1.6.el5.  These are dual/dual opteron machines with 8G ram per.
> The clients are running the same version, only built on RHEL4 kernel.  The
> filesystem is very large (3 x 8TB ext3s) with a fs.conf file:

Hi 

This configuration *should* work.  About a month ago we had a round of
emails diagnosing some problems with RHEL 4, but that was a different
problem -- the kernel module would not get built correctly.

> I have a user that consistently crashes one of the servers (seems random).
> After enabling segv-backtrace, I get the following message:
> 
> [D 02/14 15:48] PVFS2 Server version 2.7.0 starting.
> [E 02/14 15:55] PVFS2 server: signal 11, faulty address is 0x18, from 0x3cb366eef3
> [E 02/14 15:55] [bt] /lib64/libc.so.6 [0x3cb366eef3]
> [E 02/14 15:55] [bt] /lib64/libc.so.6 [0x3cb366eef3]
> [E 02/14 15:55] [bt] /lib64/libc.so.6(cfree+0x8c) [0x3cb3672b1c]
> [E 02/14 15:55] [bt] /usr/sbin/pvfs2-server(job_testcontext+0x13b) [0x432ecb]
> [E 02/14 15:55] [bt] /usr/sbin/pvfs2-server(main+0xdc8) [0x4109f8]
> [E 02/14 15:55] [bt] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3cb361d8a4]
> [E 02/14 15:55] [bt] /usr/sbin/pvfs2-server [0x40e369]
> 
> Does anyone have a clue about this one?

This backtrace is interesting, but without debugging information it's
hard to say what's going on here.  

Can you rebuild the pvfs servers with debugging information?  (Add
'-g' to CFLAGS, re-run configure, and rebuild.)  A signal 11 like this
should also be easy to diagnose with valgrind.
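
Once there is a server binary with symbols, the frame addresses from a
fresh backtrace can be mapped to source lines with binutils' addr2line.
Here is a rough sketch of how the [bt] lines could be turned into
addr2line commands.  The helper names (parse_bt, addr2line_cmds) are
made up for illustration, and it assumes the log keeps the same
"[bt] object(symbol+offset) [address]" layout as above.  It only emits
commands for frames inside the server binary itself, since the libc
addresses are load addresses that addr2line cannot resolve directly.

```python
import re

# One backtrace frame as it appears in the server log, e.g.
#   [bt] /usr/sbin/pvfs2-server(job_testcontext+0x13b) [0x432ecb]
#   [bt] /usr/sbin/pvfs2-server [0x40e369]
FRAME_RE = re.compile(
    r"\[bt\]\s+(?P<obj>\S+?)(?:\((?P<sym>[^)]*)\))?\s+\[(?P<addr>0x[0-9a-fA-F]+)\]"
)


def parse_bt(log_text):
    """Return a list of (object_path, symbol_or_empty, address) tuples."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append((m.group("obj"), m.group("sym") or "", m.group("addr")))
    return frames


def addr2line_cmds(frames, binary="/usr/sbin/pvfs2-server"):
    """Build one addr2line command per frame that lies in `binary`.

    -f prints the function name, -e selects the executable to inspect.
    The default path is taken from the log above; adjust as needed.
    """
    return [
        f"addr2line -f -e {obj} {addr}"
        for obj, _sym, addr in frames
        if obj == binary
    ]
```

Feed it the text of the log and run the commands it returns against the
rebuilt binary.  Note that addresses from the old, stripped build will
not line up with a new build, so the backtrace has to come from a crash
of the '-g' binary.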

Thanks
==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B