Hey Phil,

Unfortunately, I didn't have any luck with the patch. I didn't get a core
file this time, but one of the daemons stopped responding. I was able to run
pvfs2-ping and pvfs2-statfs again, but as soon as I tried to write that file,
the server stalled. What other information can I get you?

Bart.



On Thu, May 13, 2010 at 12:14 PM, Phil Carns <[email protected]> wrote:

> Hey Bart,
>
> I haven't really tested this change yet, but can you try the attached patch
> and see if it solves the problem? I think this is a follow-on to the same
> bug you guys reported earlier. I just missed another race issue caused by
> the last patch.
>
> -Phil
>
>
> On 05/12/2010 05:18 PM, Bart Taylor wrote:
>
> Hey guys,
>
> I have a 3-node local-disk file system that had a core dump during some
> testing. The fs was upgraded from 2.6 to 2.8.2. After the upgrade, I ran a
> couple of utilities like pvfs2-ping and pvfs2-statfs. After those succeeded,
> I attempted to create a new file of around 800K, and the first server died.
> There wasn't anything useful in the logs or dmesg. Below is a backtrace from
> the core file. I can supply the entire file, but I can't email it at 43M.
>
> This may be related to the precreate-pool-race patch from a few days ago
> since the backtrace indicates it was in the vicinity of those code changes.
>
> Let me know what else I can supply that will help.
>
> Bart.
>
>
>
>
> (gdb) bt
> #0  0x009e37a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> #1  0x00a24825 in raise () from /lib/tls/libc.so.6
> #2  0x00a26289 in abort () from /lib/tls/libc.so.6
> #3  0x00a58d2a in __libc_message () from /lib/tls/libc.so.6
> #4  0x00a5f72f in _int_free () from /lib/tls/libc.so.6
> #5  0x00a5fbaa in free () from /lib/tls/libc.so.6
> #6  0x0807d6e5 in precreate_pool_get_thread_mgr_callback_unlocked
>     (data=0xb55d30f0, error_code=0) at ../pvfs2_src/src/io/job/job.c:4456
> #7  0x0807fd3d in precreate_pool_get_handles_try_post (jd=0xb55d4110)
>     at ../pvfs2_src/src/io/job/job.c:5930
> #8  0x0807f5b9 in job_precreate_pool_get_handles (fsid=140299291, count=2,
>     servers=0x0, handle_array=0xb55d41f0, flags=0, user_ptr=0xb5507c98,
>     status_user_tag=0, out_status_p=0x9c23348, id=0xbffc11b0, context_id=0,
>     hints=0xb5506a88) at ../pvfs2_src/src/io/job/job.c:5718
> #9  0x0806c3cc in get_handles (smcb=0xb5507c98, js_p=0x9c23348)
>     at ../pvfs2_src/src/server/unstuff.sm:267
> #10 0x08095e06 in PINT_state_machine_invoke (smcb=0xb5507c98, r=0x9c23348)
>     at ../pvfs2_src/src/common/misc/state-machine-fns.c:132
> #11 0x080961c4 in PINT_state_machine_next (smcb=0xb5507c98, r=0x9c23348)
>     at ../pvfs2_src/src/common/misc/state-machine-fns.c:309
> #12 0x08096200 in PINT_state_machine_continue (smcb=0xb5507c98,
>     r=0x9c23348) at ../pvfs2_src/src/common/misc/state-machine-fns.c:327
> #13 0x0805667c in main (argc=6, argv=0xbffc1334)
>     at ../pvfs2_src/src/server/pvfs2-server.c:413
>
>
> _______________________________________________
> Pvfs2-developers mailing list
> [email protected]
> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers