Correction: I did get a core file this time; I just overlooked it. The
backtrace is below.

Bart.


(gdb) bt
#0  0x008247a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x00865825 in raise () from /lib/tls/libc.so.6
#2  0x00867289 in abort () from /lib/tls/libc.so.6
#3  0x00899d2a in __libc_message () from /lib/tls/libc.so.6
#4  0x008a072f in _int_free () from /lib/tls/libc.so.6
#5  0x008a0baa in free () from /lib/tls/libc.so.6
#6  0x0808b73d in precreate_pool_get_thread_mgr_callback_unlocked (data=0x9f31188,
    error_code=0) at ../pvfs2_src/src/io/job/job.c:4460
#7  0x0808dda7 in precreate_pool_get_handles_try_post (jd=0x9f68ca0)
    at ../pvfs2_src/src/io/job/job.c:5935
#8  0x0808d623 in job_precreate_pool_get_handles (fsid=1141984428, count=2, servers=0x0,
    handle_array=0x9f35a20, flags=0, user_ptr=0x9f512f0, status_user_tag=0,
    out_status_p=0x9f0c348, id=0xbfebcc00, context_id=0, hints=0x9f4df58)
    at ../pvfs2_src/src/io/job/job.c:5723
#9  0x080c2ad8 in get_handles (smcb=0x9f512f0, js_p=0x9f0c348)
    at ../pvfs2_src/src/server/unstuff.sm:267
#10 0x0807630a in PINT_state_machine_invoke (smcb=0x9f512f0, r=0x9f0c348)
    at ../pvfs2_src/src/common/misc/state-machine-fns.c:132
#11 0x080766c8 in PINT_state_machine_next (smcb=0x9f512f0, r=0x9f0c348)
    at ../pvfs2_src/src/common/misc/state-machine-fns.c:309
#12 0x08076704 in PINT_state_machine_continue (smcb=0x9f512f0, r=0x9f0c348)
    at ../pvfs2_src/src/common/misc/state-machine-fns.c:327
#13 0x0805667c in main (argc=6, argv=0xbfebcd84)
    at ../pvfs2_src/src/server/pvfs2-server.c:413
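
For context on frames #3 through #5: glibc's free() aborts through
__libc_message() when it detects an inconsistent heap, most commonly a double
free or a free of an invalid pointer. The sketch below is not PVFS2 code and
does not claim to show the actual bug or its fix. It only illustrates, under
the assumption that two code paths can race to release the same buffer, one
common way to make the release idempotent. All names in it (req_desc,
req_desc_release_payload, and so on) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical per-request descriptor; NOT a PVFS2 type. */
struct req_desc {
    _Atomic(void *) payload;   /* heap buffer that only one path may free */
};

/* Allocate a descriptor that owns a payload of the given size. */
struct req_desc *req_desc_new(size_t payload_size)
{
    struct req_desc *rd = malloc(sizeof(*rd));
    if (rd == NULL)
        return NULL;

    void *buf = malloc(payload_size);
    if (buf == NULL) {
        free(rd);
        return NULL;
    }
    atomic_init(&rd->payload, buf);
    return rd;
}

/*
 * Release the payload at most once. The atomic exchange hands the pointer
 * to exactly one caller; every later (or concurrent) caller sees NULL and
 * skips the free. Returns 1 if this call freed the payload, 0 otherwise.
 */
int req_desc_release_payload(struct req_desc *rd)
{
    assert(rd != NULL);

    void *victim = atomic_exchange(&rd->payload, (void *)NULL);
    if (victim == NULL)
        return 0;

    free(victim);
    return 1;
}

/* Free the descriptor itself, releasing the payload if still owned. */
void req_desc_destroy(struct req_desc *rd)
{
    req_desc_release_payload(rd);
    free(rd);
}
```

With this pattern, a callback invoked inline and a callback invoked later from
another thread can both call req_desc_release_payload() safely; only the first
one frees the buffer. The descriptor itself still needs a single owner that
calls req_desc_destroy().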




On Thu, May 13, 2010 at 2:18 PM, Bart Taylor <[email protected]> wrote:

> Hey Phil,
>
> Unfortunately, I didn't have any luck with the patch. I didn't get a core
> file this time, but one of the daemons quit responding. I was able to run a
> ping and statfs again, but as soon as I tried to write that file, the server
> stalled. What other information can I get you?
>
> Bart.
>
>
>
>
> On Thu, May 13, 2010 at 12:14 PM, Phil Carns <[email protected]> wrote:
>
>>  Hey Bart,
>>
>> I haven't really tested this change yet, but can you try the attached
>> patch and see if that solves the problem? I think this is a follow-on
>> to the same bug you guys reported earlier. I just missed another race
>> issue caused by the last patch.
>>
>> -Phil
>>
>>
>> On 05/12/2010 05:18 PM, Bart Taylor wrote:
>>
>> Hey guys,
>>
>> I have a 3-node local-disk file system that had a core dump during some
>> testing. The file system was upgraded from 2.6 to 2.8.2. After the upgrade,
>> I ran a couple of utilities like pvfs2-ping and pvfs2-statfs. After those
>> succeeded, I attempted to create a new file of around 800K, and the first
>> server died. There wasn't anything useful in the logs or dmesg. Below is a
>> backtrace from the core file. I can supply the entire file, but at 43M it is
>> too large to email.
>>
>> This may be related to the precreate-pool-race patch from a few days ago
>> since the backtrace indicates it was in the vicinity of those code changes.
>>
>> Let me know what else I can supply that will help.
>>
>> Bart.
>>
>>
>>
>>
>> (gdb) bt
>> #0  0x009e37a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
>> #1  0x00a24825 in raise () from /lib/tls/libc.so.6
>> #2  0x00a26289 in abort () from /lib/tls/libc.so.6
>> #3  0x00a58d2a in __libc_message () from /lib/tls/libc.so.6
>> #4  0x00a5f72f in _int_free () from /lib/tls/libc.so.6
>> #5  0x00a5fbaa in free () from /lib/tls/libc.so.6
>> #6  0x0807d6e5 in precreate_pool_get_thread_mgr_callback_unlocked (data=0xb55d30f0,
>>     error_code=0) at ../pvfs2_src/src/io/job/job.c:4456
>> #7  0x0807fd3d in precreate_pool_get_handles_try_post (jd=0xb55d4110)
>>     at ../pvfs2_src/src/io/job/job.c:5930
>> #8  0x0807f5b9 in job_precreate_pool_get_handles (fsid=140299291, count=2, servers=0x0,
>>     handle_array=0xb55d41f0, flags=0, user_ptr=0xb5507c98, status_user_tag=0,
>>     out_status_p=0x9c23348, id=0xbffc11b0, context_id=0, hints=0xb5506a88)
>>     at ../pvfs2_src/src/io/job/job.c:5718
>> #9  0x0806c3cc in get_handles (smcb=0xb5507c98, js_p=0x9c23348)
>>     at ../pvfs2_src/src/server/unstuff.sm:267
>> #10 0x08095e06 in PINT_state_machine_invoke (smcb=0xb5507c98, r=0x9c23348)
>>     at ../pvfs2_src/src/common/misc/state-machine-fns.c:132
>> #11 0x080961c4 in PINT_state_machine_next (smcb=0xb5507c98, r=0x9c23348)
>>     at ../pvfs2_src/src/common/misc/state-machine-fns.c:309
>> #12 0x08096200 in PINT_state_machine_continue (smcb=0xb5507c98, r=0x9c23348)
>>     at ../pvfs2_src/src/common/misc/state-machine-fns.c:327
>> #13 0x0805667c in main (argc=6, argv=0xbffc1334)
>>     at ../pvfs2_src/src/server/pvfs2-server.c:413
>>
>>
>> _______________________________________________
>> Pvfs2-developers mailing list
>> [email protected]
>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
>>
>>
>>
>>
>>
>
