Working from release 2.6.2, I found a reproducible segfault in
ib_close_connection when running `pvfs2-ls` (ppc64, OpenIB). The hardware
appears to be functioning properly, and I have reproduced the crash on both
eHCA and Mellanox cards. I am running NetPIPE over IB right now.
Here's the backtrace:
[E 15:53:25.692442] Warning: exchange_data: partial read, 1/4 bytes.
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 4398046676096 (LWP 25430)]
0x00000000100b441c in ib_close_connection (c=0x10129790)
at src/io/bmi/bmi_ib/ib.c:1613
1613 ibmap = c->remote_map->method_data;
(gdb) bt
#0 0x00000000100b441c in ib_close_connection (c=0x10129790)
at src/io/bmi/bmi_ib/ib.c:1613
#1 0x00000000100b41f4 in ib_new_connection (sock=10,
peername=0xfffffcc4e9c "da5:3336", is_server=0)
at src/io/bmi/bmi_ib/ib.c:1583
#2 0x00000000100b46c8 in ib_tcp_client_connect (ibmap=0x10128f20,
remote_map=0x10128f10) at src/io/bmi/bmi_ib/ib.c:1657
#3 0x00000000100b10b4 in ensure_connected (remote_map=0x10128f10)
at src/io/bmi/bmi_ib/ib.c:692
#4 0x00000000100b1950 in generic_post_recv (id=0x10129180,
remote_map=0x10128f10, numbufs=0, buffers=0xfffffcc5900,
sizes=0xfffffcc5908, tot_expected_len=32808, tag=1, user_ptr=0x10129158,
context_id=0) at src/io/bmi/bmi_ib/ib.c:852
#5 0x00000000100b208c in BMI_ib_post_recv (id=0x10129180,
remote_map=0x10128f10, buffer=0x10135b60, expected_len=32808,
actual_len=0x10129190, buffer_flag=BMI_PRE_ALLOC, tag=1,
user_ptr=0x10129158, context_id=0) at src/io/bmi/bmi_ib/ib.c:971
#6 0x0000000010063c34 in BMI_post_recv (id=0x10129180, src=2,
buffer=0x10135b60, expected_size=32808, actual_size=0x10129190,
buffer_type=BMI_PRE_ALLOC, tag=1, user_ptr=0x10129158, context_id=0)
at src/io/bmi/bmi.c:535
#7 0x0000000010073d60 in job_bmi_recv (addr=2, buffer=0x10135b60,
size=32808, tag=1, buffer_type=BMI_PRE_ALLOC, user_ptr=0x10129ed0,
status_user_tag=0, out_status_p=0x1012a540, id=0x1012a4f8, context_id=1,
timeout_sec=30) at src/io/job/job.c:536
#8 0x000000001005ced0 in msgpairarray_post (sm_p=0x10129ed0,
js_p=0xfffffcc5c78) at msgpairarray.sm:269
#9 0x0000000010012e90 in PINT_state_machine_next (s=0x10129ed0,
r=0xfffffcc5c78) at state-machine-fns.h:158
#10 0x0000000010012a78 in PINT_client_state_machine_post (sm_p=0x10129ed0,
pvfs_sys_op=19, op_id=0xfffffcc5e18, user_ptr=0x0)
at src/client/sysint/client-state-machine.c:312
#11 0x000000001003c144 in PVFS_isys_fs_add (mntent=0x1010e090,
op_id=0xfffffcc5e18, user_ptr=0x0) at fs-add.sm:188
#12 0x000000001003c1b8 in PVFS_sys_fs_add (mntent=0x1010e090) at
fs-add.sm:197
#13 0x000000001000eb24 in main (argc=1, argv=0xfffffcc6c38)
at src/apps/admin/pvfs2-ls.c:779
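
One possible reading of the backtrace, offered as a guess rather than a
diagnosis: frame #1 shows ib_new_connection calling ib_close_connection
right after the exchange_data partial-read warning, so the close may be
running on a setup error path before c->remote_map has been assigned. If so,
line 1613 would dereference a NULL remote_map. The sketch below shows the
kind of guard that would avoid that. All struct layouts and the function
name close_connection_sketch are hypothetical stand-ins for illustration,
not the actual PVFS2 definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the BMI address and IB connection types.
 * Only the fields needed to illustrate the guard are modeled. */
struct method_addr {
    void *method_data;              /* points at the IB-specific address */
};

struct ib_method_addr {
    void *c;                        /* back-pointer to the connection */
};

struct ib_connection {
    struct method_addr *remote_map; /* assumed NULL until setup succeeds */
};

/* Tear down a heap-allocated connection that may be only partially set up.
 * Returns 1 if a remote map was detached, 0 if none was attached. */
int close_connection_sketch(struct ib_connection *c)
{
    int detached = 0;

    assert(c != NULL);

    /* Guard: only follow remote_map if setup got far enough to assign it. */
    if (c->remote_map != NULL) {
        struct ib_method_addr *ibmap = c->remote_map->method_data;
        if (ibmap != NULL) {
            ibmap->c = NULL;        /* drop the back-pointer to this connection */
            detached = 1;
        }
    }

    free(c);
    return detached;
}
```

With a guard like this, closing a connection whose handshake failed early
would release the connection without touching the unset map. Whether
remote_map really is NULL here would need to be confirmed in gdb, for
example with `print c->remote_map` in frame #0.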
--
Kyle Schochenmaier
[EMAIL PROTECTED]
Research Assistant, Dr. Brett Bode
AmesLab - US Dept.Energy
Scalable Computing Laboratory