[EMAIL PROTECTED] wrote on Tue, 30 Jan 2007 16:16 -0600:
> Working off of release 2.6.2, I found a reproducible segfault in
> ib_close_connection when running `pvfs2-ls` (ppc64-openib). The hardware
> appears to be functioning properly, and I have reproduced this on both
> eHCA and Mellanox cards. I'm running NetPIPE over IB right now.
>
> Here's the backtrace:
>
>
> [E 15:53:25.692442] Warning: exchange_data: partial read, 1/4 bytes.
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 4398046676096 (LWP 25430)]
> 0x00000000100b441c in ib_close_connection (c=0x10129790)
> at src/io/bmi/bmi_ib/ib.c:1613
> 1613 ibmap = c->remote_map->method_data;
> (gdb) bt
> #0 0x00000000100b441c in ib_close_connection (c=0x10129790)
> at src/io/bmi/bmi_ib/ib.c:1613
> #1 0x00000000100b41f4 in ib_new_connection (sock=10,
> peername=0xfffffcc4e9c "da5:3336", is_server=0)
> at src/io/bmi/bmi_ib/ib.c:1583

Thanks. This was fixed in head on 29 Dec. The 2.6 branch is pretty
old as far as IB goes. I don't have enough discipline to separate
the "fixes" from the "new development" required to maintain a
branch. Here's a bit of a diff. The line numbers are probably off as
I cut out just the relevant bits.

Note that your setup still won't work. The server side must not be
running the same version, or differs in some other way. This crash
happens after the client realizes it is not getting a good answer
from the server (hence your warning).

Also know that the head has a bunch of nice improvements that should
make it perform better too.

-- Pete

--- src/io/bmi/bmi_ib/ib.c 2007-01-17 12:09:49.000000000 -0500
+++ ../pvfs2/src/io/bmi/bmi_ib/ib.c 2007-01-21 15:56:26.000000000 -0500
@@ -1593,8 +1662,6 @@
  */
 static void ib_close_connection(ib_connection_t *c)
 {
-    ib_method_addr_t *ibmap;
-
     debug(2, "%s: closing connection to %s", __func__, c->peername);
     c->closed = 1;
     if (c->refcnt != 0) {
@@ -1610,8 +1677,10 @@
     free(c->eager_recv_buf_head_contig);
     /* never free the remote map, for the life of the executable, just
      * mark it unconnected since BMI will always have this structure. */
-    ibmap = c->remote_map->method_data;
-    ibmap->c = NULL;
+    if (c->remote_map) {
+        ib_method_addr_t *ibmap = c->remote_map->method_data;
+        ibmap->c = NULL;
+    }
     free(c->peername);
     qlist_del(&c->list);
     free(c);
@@ -1792,8 +1792,7 @@ static int ib_tcp_server_check_new_conne
         c = ib_new_connection(s, peername, 1);
         if (!c) {
             free(hostname);
-            close(s);
-            return 0;
+            goto out_unlock;
         }
         c->remote_map = ib_alloc_method_addr(c, hostname, port);
@@ -1804,12 +1803,12 @@ static int ib_tcp_server_check_new_conne
         debug(2, "%s: accepted new connection %s at server", __func__,
           c->peername);
+        ret = 1;
+out_unlock:
         gen_mutex_unlock(&interface_mutex);
-
         if (close(s) < 0)
             error_errno("%s: close new sock", __func__);
-        ret = 1;
     }
     return ret;
 }