No, I have not had the chance to test it yet. I will test it later. I also ran pvfs2-cp under valgrind, and this is what I got:

......
Syscall param writev(vector[...]) points to uninitialised byte(s)
==31882==    at 0x4EE368: writev (in /lib/libc-2.5.so)
==31882==    by 0x80AE49E: BMI_sockio_nbvector (sockio.c:303)
==31882==    by 0x80AA658: payload_progress (bmi-tcp.c:4019)
==31882==    by 0x80AB1D0: tcp_post_send_generic (bmi-tcp.c:3886)
==31882==    by 0x80AB51E: BMI_tcp_post_sendunexpected_list (bmi-tcp.c:1513)
==31882==    by 0x80A6C8E: job_bmi_send_list (job.c:512)
==31882==    by 0x808DBB1: msgpairarray_post (msgpairarray.sm:322)
==31882==    by 0x8090432: PINT_state_machine_invoke (state-machine-fns.c:132)
==31882==    by 0x809077C: PINT_state_machine_next (state-machine-fns.c:309)
==31882==    by 0x8090300: PINT_state_machine_continue (state-machine-fns.c:327)
==31882==    by 0x8056FCB: PINT_client_state_machine_test (client-state-machine.c:662)
==31882==    by 0x8057118: PINT_client_wait_internal (client-state-machine.c:798)
==31882==  Address 0x474400c is 12 bytes inside a block of size 508 alloc'd
==31882==    at 0x4004B11: memalign (vg_replace_malloc.c:532)
==31882==    by 0x4004B6B: posix_memalign (vg_replace_malloc.c:660)
==31882==    by 0x80AF4BF: PINT_mem_aligned_alloc (pint-mem.c:27)
==31882==    by 0x8079CC5: encode_common (PINT-le-bytefield.c:333)
==31882==    by 0x807C851: lebf_encode_req (PINT-le-bytefield.c:369)
==31882==    by 0x808DDA1: msgpairarray_post (msgpairarray.sm:209)
==31882==    by 0x8090432: PINT_state_machine_invoke (state-machine-fns.c:132)
==31882==    by 0x809077C: PINT_state_machine_next (state-machine-fns.c:309)
==31882==    by 0x8090300: PINT_state_machine_continue (state-machine-fns.c:327)
==31882==    by 0x8056FCB: PINT_client_state_machine_test (client-state-machine.c:662)
==31882==    by 0x8057118: PINT_client_wait_internal (client-state-machine.c:798)
==31882==    by 0x805720D: PVFS_sys_wait (client-state-machine.c:960)
==31882==
==31882== Invalid read of size 4
==31882==    at 0x80608DA: io_inspect_attr (sys-io.sm:2926)
==31882==    by 0x8090432: PINT_state_machine_invoke (state-machine-fns.c:132)
==31882==    by 0x809077C: PINT_state_machine_next (state-machine-fns.c:309)
==31882==    by 0x8090300: PINT_state_machine_continue (state-machine-fns.c:327)
==31882==    by 0x8090390: PINT_state_machine_start (state-machine-fns.c:202)
==31882==    by 0x80577A6: PINT_client_state_machine_post (client-state-machine.c:405)
==31882==    by 0x80657F9: PVFS_isys_io (sys-io.sm:346)
==31882==    by 0x806598B: PVFS_sys_io (sys-io.sm:370)
==31882==    by 0x8053659: main (pvfs2-cp.c:340)
==31882==  Address 0x10 is not stack'd, malloc'd or (recently) free'd
==31882==
==31882==
==31882== Process terminating with default action of signal 11 (SIGSEGV)
==31882==  Access not within mapped region at address 0x10
==31882==    at 0x80608DA: io_inspect_attr (sys-io.sm:2926)
==31882==    by 0x8090432: PINT_state_machine_invoke (state-machine-fns.c:132)
==31882==    by 0x809077C: PINT_state_machine_next (state-machine-fns.c:309)
==31882==    by 0x8090300: PINT_state_machine_continue (state-machine-fns.c:327)
==31882==    by 0x8090390: PINT_state_machine_start (state-machine-fns.c:202)
==31882==    by 0x80577A6: PINT_client_state_machine_post (client-state-machine.c:405)
==31882==    by 0x80657F9: PVFS_isys_io (sys-io.sm:346)
==31882==    by 0x806598B: PVFS_sys_io (sys-io.sm:370)
==31882==    by 0x8053659: main (pvfs2-cp.c:340)
==31882==  If you believe this happened as a result of a stack
==31882==  overflow in your program's main thread (unlikely but
==31882==  possible), you can try to increase the size of the
==31882==  main thread stack using the --main-stacksize= flag.
==31882==  The main thread stack size used in this run was 10485760.
==31882==
==31882== HEAP SUMMARY:
==31882==     in use at exit: 10,553,111 bytes in 161 blocks
==31882==   total heap usage: 735 allocs, 574 frees, 83,393,174 bytes allocated
==31882==
==31882== LEAK SUMMARY:
==31882==    definitely lost: 0 bytes in 0 blocks
==31882==    indirectly lost: 0 bytes in 0 blocks
==31882==      possibly lost: 2,819 bytes in 32 blocks
==31882==    still reachable: 10,550,292 bytes in 129 blocks
==31882==         suppressed: 0 bytes in 0 blocks
==31882== Rerun with --leak-check=full to see details of leaked memory
==31882==
==31882== For counts of detected and suppressed errors, rerun with: -v
==31882== Use --track-origins=yes to see where uninitialised values come from
==31882== ERROR SUMMARY: 10 errors from 3 contexts (suppressed: 13 from 8)
Segmentation fault

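
A note on the invalid read reported above: an address as small as 0x10 is what a read through a NULL struct pointer typically looks like (NULL plus the offset of the member being read). Which pointer is NULL is not established by this output. As a minimal sketch only, assuming the faulting statement is the dist_p->methods->next_mapped_offset(...) call named in the original report quoted below, a guarded version of that call could look like the following. The demo_* types and functions are hypothetical stand-ins for illustration, not the real PVFS2 definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real PVFS2 distribution types differ. */
typedef int64_t demo_offset;

struct demo_dist_methods {
    demo_offset (*next_mapped_offset)(void *params, void *file_data,
                                      demo_offset logical_offset);
};

struct demo_dist {
    void *params;
    struct demo_dist_methods *methods;
};

/*
 * Guarded form of:
 *   first_unstuffed_offset = dist_p->methods->next_mapped_offset(...);
 * Returns 0 and stores the result in *out on success, or -1 if any
 * pointer in the chain is NULL (instead of faulting on the read).
 */
static int demo_first_unstuffed_offset(const struct demo_dist *dist_p,
                                       void *file_data,
                                       demo_offset *out)
{
    assert(out != NULL);

    if (dist_p == NULL ||
        dist_p->methods == NULL ||
        dist_p->methods->next_mapped_offset == NULL) {
        return -1;
    }

    *out = dist_p->methods->next_mapped_offset(dist_p->params, file_data, 0);
    return 0;
}

/* Trivial callback used only to exercise the guard: echoes its offset. */
static demo_offset demo_identity_offset(void *params, void *file_data,
                                        demo_offset logical_offset)
{
    (void)params;
    (void)file_data;
    return logical_offset;
}
```

A guard like this only turns the crash into an error return. It does not explain why the pointer would be unset in the replicated setup, so printing dist_p and dist_p->methods in gdb at sys-io.sm:2926 is still the more direct check.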
--- On Mon, 14/02/11, Michael Moore <[email protected]> wrote:

From: Michael Moore <[email protected]>
Subject: Re: [Pvfs2-developers] pvfs2-cp.c returns segmentation fault
To: "jon jon" <[email protected]>
Cc: "[email protected]" <[email protected]>
Date: Monday, 14 February 2011, 16:38

Did you have a chance to test pvfs2-cp without replication enabled (maybe on a local filesystem on one metadata server)? I'm just trying to isolate whether replication is the cause or whether this is another issue. Maybe someone else on the list has seen this issue before?

Michael

On Feb 14, 2011, at 9:31 AM, jon jon <[email protected]> wrote:

Hi Michael,

Yes, I am trying to add high availability to the metadata servers using Berkeley DB, if that is what you are asking.

-Yiannis

--- On Mon, 14/02/11, Michael Moore <[email protected]> wrote:

From: Michael Moore <[email protected]>
Subject: Re: [Pvfs2-developers] pvfs2-cp.c returns segmentation fault
To: "jon jon" <[email protected]>
Cc: "[email protected]" <[email protected]>
Date: Monday, 14 February 2011, 16:25

Hi Yiannis,

What do you mean by using replication? Is the underlying filesystem shared between the two metadata servers?

Michael

On Feb 14, 2011, at 8:52 AM, jon jon <[email protected]> wrote:

Hello all,

I'm using pvfs.2-8-1. I have 1 I/O server and 2 metadata servers using replication. I am in user mode. When I run pvfs2-cp it returns a segmentation fault. The file that I am trying to cp is copied successfully.

After using gdb, here is what I get:

[Thread debugging using libthread_db enabled]
[New Thread 0xb7ef56c0 (LWP 31833)]

Program received signal SIGSEGV, Segmentation fault.
io_inspect_attr (smcb=0x9aa6610, js_p=0xbfef0c58) at src/client/sysint/sys-io.sm:2926

backtrace returned:

#0  io_inspect_attr (smcb=0x8f96ba8, js_p=0xbfe99f08) at src/client/sysint/sys-io.sm:2926
#1  0x08090433 in PINT_state_machine_invoke (smcb=0x8f96ba8, r=0xbfe99f08) at src/common/misc/state-machine-fns.c:132
#2  0x0809077d in PINT_state_machine_next (smcb=0x8f96ba8, r=0xbfe99f08) at src/common/misc/state-machine-fns.c:309
#3  0x08090301 in PINT_state_machine_continue (smcb=0x8f96ba8, r=0xbfe99f08) at src/common/misc/state-machine-fns.c:327
#4  0x08090391 in PINT_state_machine_start (smcb=0x8f96ba8, r=0xbfe99f08) at src/common/misc/state-machine-fns.c:202
#5  0x080577a7 in PINT_client_state_machine_post (smcb=0x8f96ba8, op_id=0xbfe99ff0, user_ptr=0x0) at src/client/sysint/client-state-machine.c:405
#6  0x080657fa in PVFS_isys_io (ref= {handle = 4611686018427386622, fs_id = 1938003253, __pad1 = 0}, file_req=0x80c8ec0, file_req_offset=0, buffer=0xb75cc008, mem_req=0x8f96968, credentials=0xbfe9a5ec, resp_p=0xbfe9a5e4, io_type=PVFS_IO_READ, op_id=0xbfe99ff0, hints=0x0, user_ptr=0x0) at src/client/sysint/sys-io.sm:346
#7  0x0806598c in PVFS_sys_io (ref= {handle = 4611686018427386622, fs_id = 1938003253, __pad1 = 0}, file_req=0x80c8ec0, file_req_offset=0, buffer=0xb75cc008, mem_req=0x8f96968, credentials=0xbfe9a5ec, resp_p=0xbfe9a5e4, io_type=PVFS_IO_READ, hints=0x0) at src/client/sysint/sys-io.sm:370
#8  0x0805365a in main (argc=Cannot access memory at address 0x0) at src/apps/admin/pvfs2-cp.c:340

It gets a segfault in file sys-io.sm, function unstuff_needed, when trying to execute:

first_unstuffed_offset = dist_p->methods->next_mapped_offset(dist_p->params, &fake_file_data, 0);

Any help or information about what is causing this would be great.

-Yiannis

_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
