Steve/Pradipta,
Without the -O2 option, rping is now working! Earlier, I did not realize
that the cable was yanked out. Thanks for all the help.

Ravi

-----Original Message-----
From: Ravinandan Arakali [mailto:[EMAIL PROTECTED]
Sent: Friday, July 14, 2006 3:37 PM
To: 'Steve Wise'
Cc: '[EMAIL PROTECTED]'; '[email protected]'; Leonid. Grossman (E-mail)
Subject: RE: [openib-general] ping problem with ammassocards(iWARPinterface)

As Pradipta suggested, I rebuilt the libraries after removing the
optimization (-O2 flag) from the Makefile. Now I don't see the core dump, but
no connection is established with rping. This is similar to the
failure I am seeing with the rdma_lat test.

BTW, when I start rping in server mode at, say, port 9999, should I
expect to see an entity listening on that port number when I do
"netstat -an"? Currently, I don't see that.

Ravi

-----Original Message-----
From: Steve Wise [mailto:[EMAIL PROTECTED]
Sent: Thursday, July 13, 2006 12:10 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [email protected]
Subject: Re: [openib-general] ping problem with ammassocards(iWARPinterface)

By the way, does this failure happen immediately or after some period of
time?

On Thu, 2006-07-13 at 13:27 -0500, Steve Wise wrote:
> I guess this isn't surprising since rping doesn't work for you either.
> Something fundamental is screwed up on your user side methinks...
>
> CM event 8 == RDMA_CM_EVENT_REJECTED which means either the server side
> wasn't listening on the appropriate TCP port, or the server process did
> an rdma_reject(). I'm guessing it's the former...
>
> You could use tcpdump to see if the connection request is getting RST
> by the remote side.
>
>
> On Thu, 2006-07-13 at 11:20 -0700, Ravinandan Arakali wrote:
> > With the --cma option, I don't see the error about running SM.
> > But there's no connection established.
> >
> > openfab2:/tmp/ib/src/userspace/perftest # ./rdma_lat --cma
> > pp_server_connect_cma starting server
> >
> > openfab:/tmp/ib/src/userspace/perftest # ./rdma_lat --cma 17.2.2.102
> > pp_client_connect_cma starting client
> > pp_client_connect_cma/856 unexpected CM event 8
> > pp_client_connect_cma NOT connected!
> > pp_connect_cma(17.2.2.102,18515) failed!
> >
> > There are no messages in dmesg either.
> >
> > Ravi
> >
> > -----Original Message-----
> > From: Steve Wise [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, July 13, 2006 6:55 AM
> > To: Ravinandan Arakali
> > Cc: [EMAIL PROTECTED]; [email protected]
> > Subject: Re: [openib-general] ping problem with ammasso
> > cards(iWARPinterface)
> >
> >
> > Are you trying to run this over iwarp? It doesn't need an SM...
> >
> > For the perftests rdma_lat and rdma_bw in the iwarp branch, use the
> > --cma flag.
> >
> > Steve.
> >
> >
> > On Wed, 2006-07-12 at 16:39 -0700, Ravinandan Arakali wrote:
> > > Also, I am trying to run some of the iwarp bandwidth/latency tests
> > > (available under directory perftest).
> > > The first thing to do here is to run opensm. When I run opensm (with debug
> > > level 10), I get the following error. Any idea what needs to be done to get
> > > this working?
> > >
> > > openfab2:/tmp/ib/src/userspace # opensm -d 10
> > > -------------------------------------------------
> > > OpenSM Rev:openib-1.2.0
> > > Command Line Arguments:
> > > d level = 0xa
> > > Log File: /var/log/osm.log
> > > -------------------------------------------------
> > > OpenSM Rev:openib-1.2.0
> > >
> > > Using default GUID 0x0
> > > Error: Could not get port guid
> > > Exiting SM
> > >
> > > openfab2:/tmp/ib/src/userspace # cat /var/log/osm.log
> > > Jul 12 08:35:04 718914 [B7E518C0] -> OpenSM Rev:openib-1.2.0
> > > Jul 12 08:35:04 719111 [0000] -> OpenSM Rev:openib-1.2.0
> > >
> > > Jul 12 08:35:04 721381 [B7E518C0] -> osm_sa_mad_ctrl_unbind: ERR 1A11: No
> > > previous bind
> > > Jul 12 08:35:04 721702 [0000] -> Exiting SM
> > >
> > >
> > > -----Original Message-----
> > > From: Pradipta Kumar Banerjee [mailto:[EMAIL PROTECTED]
> > > Sent: Wednesday, July 12, 2006 10:31 AM
> > > To: Ravinandan Arakali
> > > Cc: [email protected]
> > > Subject: Re: [openib-general] ping problem with ammasso cards(iWARP
> > > interface)
> > >
> > >
> > > Ravinandan,
> > > Do you still see the rping crash?
> > >
> > > Thanks,
> > > Pradipta Kumar.
> > >
> > > Ravinandan Arakali wrote:
> > > > Pradipta,
> > > > Okay, thanks.. Initially, I was not sure since I don't remember non-zero
> > > > values in /proc/krping. When I re-ran the krping test, I see following
> > > > output
> > > > openfab2:~ # cat /proc/krping
> > > > 1-amso0 891376 55711 891376 55711 1782720 27855 1782784 27856
> > > >
> > > > As you mentioned, the RDMA traffic seems to be flowing indeed!
> > > > Any idea why rping is dumping core?
> > > >
> > > > Has any testing been done using SDP with ammasso cards?
> > > >
> > > > Regards,
> > > > Ravi
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Pradipta Kumar Banerjee [mailto:[EMAIL PROTECTED]
> > > > Sent: Friday, July 07, 2006 11:20 PM
> > > > To: Ravinandan Arakali
> > > > Cc: Leonid. Grossman (E-mail); [EMAIL PROTECTED];
> > > > [email protected]
> > > > Subject: Re: [openib-general] ping problem with ammasso cards(iWARP
> > > > interface)
> > > >
> > > >
> > > > Ravinandan Arakali wrote:
> > > >> Pradipta,
> > > >> Following is the output from gdb after core dump. I have also copy-pasted
> > > >> the gdb output on client system.
> > > >>
> > > >> Attached is the dmesg output when krping test is run in verbose mode.
> > > >> The ping data on the sender(client) seems okay. The content is shifted
> > > >> forward by one character for each packet. On receiver, after receiving ping
> > > >> pkt 9, it seems to jump to pkt no. 1935. Not sure if it's because messages
> > > >> can be lost during writing to /var/log/messages ?
> > > > krping is indeed working!!...Using 'verbose' allows you to see the ping
> > > > data.
> > > > When not using 'verbose' you see only 'send/recv' messages.
> > > >> -----------------------------------------
> > > >> (gdb) run -s -vV -C100 -d -a 0.0.0.0 -p 9999
> > > >> Starting program:
> > > >> /tmp/ib/src/userspace/librdmacm/examples/.libs/rping -s -vV -C100 -d -a
> > > >> 0.0.0.0 -p 9999
> > > >> [Thread debugging using libthread_db enabled]
> > > >> [New Thread -1210054992 (LWP 3668)]
> > > >> ipaddr (0.0.0.0)
> > > >> port 9999
> > > >> created cm_id 0x804e6e0
> > > >> [New Thread -1210057824 (LWP 3671)]
> > > >> rdma_bind_addr successful
> > > >> rdma_listen
> > > >> cma_event type 4 cma_id 0x804e968 (child)
> > > >> child cma 0x804e968
> > > >>
> > > >> Program received signal SIGSEGV, Segmentation fault.
> > > >> [Switching to Thread -1210054992 (LWP 3668)]
> > > >> rping_setup_qp (cb=0x0, cm_id=0x804e968) at examples/rping.c:514
> > > >> 514 cb->pd = ibv_alloc_pd(cm_id->verbs);
> > > >> (gdb) bt
> > > >> #0 rping_setup_qp (cb=0x0, cm_id=0x804e968) at examples/rping.c:514
> > > >> #1 0x0804a716 in main (argc=9, argv=Cannot access memory at address 0x6
> > > >> ) at examples/rping.c:767
> > > >> (gdb)
> > > >>
> > > >> ---------------------------------
> > > >> (gdb) run -c -vV -C100 -d -a 17.2.2.102 -p 9999
> > > >> Starting program:
> > > >> tmp/ib/src/userspace/librdmacm/examples/.libs/rping -c -vV -C100 -d -a
> > > >> 17.2.2.102 -p 9999
> > > >> [Thread debugging using libthread_db enabled]
> > > >> [New Thread 47388824908032 (LWP 4620)]
> > > >> ipaddr (17.2.2.102)
> > > >> port 9999
> > > >> created cm_id 0x506b00
> > > >> [New Thread 1082132800 (LWP 4623)]
> > > >> cma_event type 0 cma_id 0x506b00 (parent)
> > > >> cma_event type 2 cma_id 0x506b00 (parent)
> > > >> rdma_resolve_addr - rdma_resolve_route successful
> > > >> created pd 0x506e60
> > > >> created channel 0x506e80
> > > >> created cq 0x506ea0
> > > >> created qp 0x506f40
> > > >> rping_setup_buffers called on cb 0x505010
> > > >> allocated & registered buffers...
> > > >> [New Thread 1090525504 (LWP 4624)]
> > > >> cq_thread started.
_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
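
For anyone reading the logs in this thread, the bare numbers in "unexpected CM event 8" and "cma_event type 4" are values of enum rdma_cm_event_type. A minimal sketch of a lookup table follows. The helper name cm_event_name is hypothetical and not part of librdmacm; only the value 8 == RDMA_CM_EVENT_REJECTED is confirmed in the thread, and the other entries are assumed from the order of the enum in <rdma/rdma_cma.h>, so check them against the header on your system.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a librdmacm function): translate the numeric
 * CM event values printed by rping and rdma_lat into readable names.
 * Assumption: the values follow the order of enum rdma_cm_event_type in
 * <rdma/rdma_cma.h>. Only 8 == RDMA_CM_EVENT_REJECTED is stated in the
 * thread itself.
 */
static const char *cm_event_name(int event)
{
    static const char *const names[] = {
        "RDMA_CM_EVENT_ADDR_RESOLVED",    /* 0: seen on the client */
        "RDMA_CM_EVENT_ADDR_ERROR",       /* 1 */
        "RDMA_CM_EVENT_ROUTE_RESOLVED",   /* 2: seen on the client */
        "RDMA_CM_EVENT_ROUTE_ERROR",      /* 3 */
        "RDMA_CM_EVENT_CONNECT_REQUEST",  /* 4: seen on the server */
        "RDMA_CM_EVENT_CONNECT_RESPONSE", /* 5 */
        "RDMA_CM_EVENT_CONNECT_ERROR",    /* 6 */
        "RDMA_CM_EVENT_UNREACHABLE",      /* 7 */
        "RDMA_CM_EVENT_REJECTED",         /* 8: the rdma_lat failure */
        "RDMA_CM_EVENT_ESTABLISHED",      /* 9 */
        "RDMA_CM_EVENT_DISCONNECTED",     /* 10 */
    };
    size_t count = sizeof(names) / sizeof(names[0]);

    /* Anything outside the table is reported as unknown rather than guessed. */
    if (event < 0 || (size_t)event >= count)
        return "UNKNOWN";
    return names[event];
}
```

Read this way, the client log (types 0 and 2) got as far as address and route resolution, and the server log (type 4) shows a connect request arriving just before the SIGSEGV.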
