Correction: I did get a core file this time; I just overlooked
it. Backtrace below.
Bart.
(gdb) bt
#0 0x008247a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x00865825 in raise () from /lib/tls/libc.so.6
#2 0x00867289 in abort () from /lib/tls/libc.so.6
#3 0x00899d2a in __libc_message () from /lib/tls/libc.so.6
#4 0x008a072f in _int_free () from /lib/tls/libc.so.6
#5 0x008a0baa in free () from /lib/tls/libc.so.6
#6 0x0808b73d in precreate_pool_get_thread_mgr_callback_unlocked (data=0x9f31188, error_code=0)
at ../pvfs2_src/src/io/job/job.c:4460
#7 0x0808dda7 in precreate_pool_get_handles_try_post (jd=0x9f68ca0)
at ../pvfs2_src/src/io/job/job.c:5935
#8 0x0808d623 in job_precreate_pool_get_handles (fsid=1141984428, count=2, servers=0x0, handle_array=0x9f35a20, flags=0, user_ptr=0x9f512f0, status_user_tag=0, out_status_p=0x9f0c348, id=0xbfebcc00, context_id=0, hints=0x9f4df58)
at ../pvfs2_src/src/io/job/job.c:5723
#9 0x080c2ad8 in get_handles (smcb=0x9f512f0, js_p=0x9f0c348)
at ../pvfs2_src/src/server/unstuff.sm:267
#10 0x0807630a in PINT_state_machine_invoke (smcb=0x9f512f0, r=0x9f0c348)
at ../pvfs2_src/src/common/misc/state-machine-fns.c:132
#11 0x080766c8 in PINT_state_machine_next (smcb=0x9f512f0, r=0x9f0c348)
at ../pvfs2_src/src/common/misc/state-machine-fns.c:309
#12 0x08076704 in PINT_state_machine_continue (smcb=0x9f512f0, r=0x9f0c348)
at ../pvfs2_src/src/common/misc/state-machine-fns.c:327
#13 0x0805667c in main (argc=6, argv=0xbfebcd84)
at ../pvfs2_src/src/server/pvfs2-server.c:413
On Thu, May 13, 2010 at 2:18 PM, Bart Taylor wrote:
Hey Phil,
Unfortunately, I didn't have any luck with the patch. I
didn't get a core file this time, but one of the daemons quit
responding. I was able to run a ping and statfs again, but as
soon as I tried to write that file, the server stalled. What
other information can I get you?
Bart.
On Thu, May 13, 2010 at 12:14 PM, Phil Carns wrote:
Hey Bart,
I haven't really tested this change yet, but can you try
the attached patch and see if that seems to solve the
problem? I think this is a follow-on to the same bug you
guys reported earlier. I just missed another race issue
caused by the last patch.
-Phil
On 05/12/2010 05:18 PM, Bart Taylor wrote:
Hey guys,
I have a 3-node, local-disk file system that had a core
dump during some testing. It is an upgraded fs from 2.6
to 2.8.2. After the upgrade, I ran a couple of utilities
like pvfs2-ping and pvfs2-statfs. After those succeeded,
I attempted to create a new file of around 800K, and the
first server died. There wasn't anything useful in the
logs or dmesg. Below is a backtrace from the core file.
I can supply the entire file, but at 43M it is too large to email.
This may be related to the precreate-pool-race patch
from a few days ago since the backtrace indicates it was
in the vicinity of those code changes.
Let me know what else I can supply that will help.
Bart.
(gdb) bt
#0 0x009e37a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x00a24825 in raise () from /lib/tls/libc.so.6
#2 0x00a26289 in abort () from /lib/tls/libc.so.6
#3 0x00a58d2a in __libc_message () from /lib/tls/libc.so.6
#4 0x00a5f72f in _int_free () from /lib/tls/libc.so.6
#5 0x00a5fbaa in free () from /lib/tls/libc.so.6
#6 0x0807d6e5 in precreate_pool_get_thread_mgr_callback_unlocked (data=0xb55d30f0, error_code=0)
at ../pvfs2_src/src/io/job/job.c:4456
#7 0x0807fd3d in precreate_pool_get_handles_try_post (jd=0xb55d4110)
at ../pvfs2_src/src/io/job/job.c:5930
#8 0x0807f5b9 in job_precreate_pool_get_handles (fsid=140299291, count=2, servers=0x0, handle_array=0xb55d41f0, flags=0, user_ptr=0xb5507c98, status_user_tag=0, out_status_p=0x9c23348, id=0xbffc11b0, context_id=0, hints=0xb5506a88)
at ../pvfs2_src/src/io/job/job.c:5718
#9 0x0806c3cc in get_handles (smcb=0xb5507c98, js_p=0x9c23348)
at ../pvfs2_src/src/server/unstuff.sm:267
#10 0x08095e06 in PINT_state_machine_invoke (smcb=0xb5507c98, r=0x9c23348)
at ../pvfs2_src/src/common/misc/state-machine-fns.c:132
#11 0x080961c4 in PINT_state_machine_next (smcb=0xb5507c98, r=0x9c23348)
at ../pvfs2_src/src/common/misc/state-machine-fns.c:309
#12 0x08096200 in PINT_state_machine_continue (smcb=0xb5507c98, r=0x9c23348)
at ../pvfs2_src/src/common/misc/state-machine-fns.c:327
#13 0x0805667c in main (argc=6, argv=0xbffc1334)
at ../pvfs2_src/src/server/pvfs2-server.c:413
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers