Steve Wise wrote:
> On Fri, 2006-06-23 at 18:20 +0530, Pradipta Kumar Banerjee wrote:
>> Steve Wise wrote:
>>> The goal of adding the return codes was so that the rping program could
>>> exit with a status indicating success or failure. Every rping run
>>> results in a DISCONNECT event, so I don't think we want to treat that
>>> case as an error.
>> DISCONNECT event will be generated when the connection is closed or in case
>> of some error (like CCAE_LLP_CONNECTION_LOST, CCAE_BAD_CLOSE in case of
>> Ammasso driver etc).
>
> You'll also get the DISCONNECT event when one side finished the rping
> loops and does rdma_disconnect(). So receiving that event isn't
> necessarily an error...

Yes definitely, but this event can _also_ be received due to errors!!

>>> Also, can you explain why this fixes Amith's problem, which sounded like
>>> a process was hanging?
>>>
>> On debugging I found that the main thread was blocked in ibv_destroy_cq(),
>> cm_thread was blocked in rdma_get_cm_event->write() and cq_thread was
>> blocked in ibv_get_cq_event->read
>> Taking the return value of the DISCONNECT event into consideration
>> forcefully killed the process.
>> On delving deeper into this problem, I think that there is more to this
>> rping hang. Let me work on this further.
>>
>
> I think rping needs some coordination on these threads and when they
> should be killed.
>

Right..
Thanks,
Pradipta

>> On a related note - I noticed another rping hang in the following case
>> - Start the rping as a client without first starting an rping server
>> - If you are lucky the first run itself will result in the 'lt-rping'
>> process in 'D' state. If not repeating the procedure will result in the
>> hang.
>>
>> This is the o/p.
>>
>> cq completion failed status 5
>> wait for CONNECTED state 10
>> connect error -1
>>
>> Thanks,
>> Pradipta.
>>
>>> Thanks,
>>>
>>> Steve.
>>>
>>> On Fri, 2006-06-23 at 00:53 +0530, Pradipta Kumar Banerjee wrote:
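
To make the DISCONNECT point concrete, here is a minimal sketch of how an exit status could tell an expected disconnect from an unexpected one. This is not the rping code: every enum, struct, and function name below is hypothetical, and no librdmacm calls are used. The only idea taken from the thread is that a DISCONNECT arriving after the ping loops have finished is normal, while one arriving earlier points to a failure. Which side marks the test as done, and when, is left open.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical connection states; NOT the identifiers used by rping. */
enum sketch_state {
    SKETCH_CONNECTING,
    SKETCH_CONNECTED,
    SKETCH_TEST_DONE,   /* all ping loops completed */
    SKETCH_ERROR
};

/* Hypothetical event kinds standing in for connection-manager events. */
enum sketch_event {
    SKETCH_EV_ESTABLISHED,
    SKETCH_EV_DISCONNECTED
};

struct sketch_ctx {
    enum sketch_state state;
};

/* Record that the ping loops finished; only valid while connected. */
void sketch_mark_test_done(struct sketch_ctx *ctx)
{
    assert(ctx != NULL);
    if (ctx->state == SKETCH_CONNECTED)
        ctx->state = SKETCH_TEST_DONE;
}

/* Update the state for one event. */
void sketch_handle_event(struct sketch_ctx *ctx, enum sketch_event ev)
{
    assert(ctx != NULL);
    switch (ev) {
    case SKETCH_EV_ESTABLISHED:
        if (ctx->state == SKETCH_CONNECTING)
            ctx->state = SKETCH_CONNECTED;
        break;
    case SKETCH_EV_DISCONNECTED:
        /* A disconnect is expected only once the test has finished;
         * at any earlier point it is treated as a failure. */
        if (ctx->state != SKETCH_TEST_DONE)
            ctx->state = SKETCH_ERROR;
        break;
    }
}

/* Process exit status: 0 if the test ran to completion, 1 otherwise. */
int sketch_exit_status(const struct sketch_ctx *ctx)
{
    assert(ctx != NULL);
    return ctx->state == SKETCH_TEST_DONE ? 0 : 1;
}
```

With this shape, a run that connects, finishes its loops, and then sees DISCONNECT exits with 0, while a DISCONNECT that shows up before the loops finish (for example after a lost connection) exits with 1. It does not address the thread hang itself.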
