I am testing APM with a kernel module that interfaces directly with ib_verbs.ko and ib_cm.ko. Yes, I do receive the IB_MIG_MIGRATED event, but the QP's mig_state is not actually changed to MIGRATED, so I had to set it from my module.
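To make the workaround concrete, here is a rough user-space model of what the module's event handler does. It is only a sketch, not the module code: every `sketch_` name is hypothetical. In the real module the state change would presumably go through ib_modify_qp() with the IB_QP_PATH_MIG_STATE attribute mask.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three APM migration states
 * (MIGRATED, REARM, ARMED). These are not the kernel's own enums. */
enum sketch_mig_state {
    SKETCH_MIG_MIGRATED,
    SKETCH_MIG_REARM,
    SKETCH_MIG_ARMED
};

/* Hypothetical event codes: a "path migrated" notification and anything else. */
enum sketch_event {
    SKETCH_EVENT_PATH_MIGRATED,
    SKETCH_EVENT_OTHER
};

/* Software view of the QP; only the migration state is modelled here. */
struct sketch_qp {
    enum sketch_mig_state mig_state;
};

/* Models the explicit state change. In a kernel module this is assumed to be
 * an ib_modify_qp() call with only the path migration state attribute set. */
int sketch_set_mig_state(struct sketch_qp *qp, enum sketch_mig_state state)
{
    assert(qp != NULL);
    qp->mig_state = state;
    return 0;
}

/* Event handler: when the migration event arrives but the QP's recorded state
 * is still not MIGRATED, force it.
 * Returns 1 if a manual fix-up was applied, 0 otherwise. */
int sketch_on_event(struct sketch_qp *qp, enum sketch_event ev)
{
    assert(qp != NULL);
    if (ev != SKETCH_EVENT_PATH_MIGRATED)
        return 0;                       /* not a migration event: nothing to do */
    if (qp->mig_state == SKETCH_MIG_MIGRATED)
        return 0;                       /* state already consistent */
    sketch_set_mig_state(qp, SKETCH_MIG_MIGRATED);
    return 1;
}
```

If the CM (or the HW) had already updated the state, the handler would be a no-op, so the manual fix-up should be harmless once the underlying bug is fixed.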
It could be a bug in the ib_cm code, which may not be transitioning the QP state correctly, even though the HW may think that it has migrated. I am not sure how exactly ib_cm should notice this event and transition the QP state. Any thoughts and suggestions are welcome. I can code it and test it.

I don't have a test program that specifically tests this functionality, and I am afraid I may not be able to share the whole module.

VBabu

Jack Morgenstein wrote:

>On Tuesday 01 August 2006 05:19, Venkatesh Babu wrote:
>
>>Configuration2: Node1 and Node2 connected through two switches for
>>each port.
>>    Node1, port1 -> switch1 -> Node2, port1
>>    Node1, port2 -> switch2 -> Node2, port2
>>
>>Node 1:
>>1. Call ib_cm_listen() to wait for connection requests
>>2. When a REQ message arrives, create an RC QP and establish a connection
>>3. Set up callback handlers to receive packets
>>4. Receive packets, verify them, and drop them
>>5. Event IB_MIG_MIGRATED received
>>6. Stopped receiving packets
>>
>>Node 2:
>>1. Create RC QP
>>2. Send REQ message to Node 1 to establish the connection (load both
>>primary and alternate paths)
>>3. Continuously send some packets
>>4. Simulate the port failure by unplugging the IB cable
>>5. Event IB_MIG_MIGRATED received
>>
>> But with
>>Configuration2, IB_EVENT_PORT_ERR event occurs on a node1, failover to
>>the alternate path doesn't work. The traffic stops. Because node1
>>doesn't know when the IB_EVENT_PORT_ERR event occurred on Node2.
>>
>
>We have not seen these problems here. We have regression tests which check
>APM, and they have run without problems. These tests have scripts which
>bring the HCA port down (equivalent to pulling the cable) to check that the
>migration occurs automatically.
>(You should NOT need to do ib_modify_qp for the migration to work in the case
>of a port error).
>
>Note, though, that these tests use the ibv_verbs layer directly. We have not
>checked out APM over the CM. There may be a bug here regarding setting up
>the alternate path properly when creating the connection (although this does
>seem strange, since you indicate that the MIGRATED event is received on both
>sides!).
>
>Please send us your test code so that we may reproduce the problem here.
>
>- Jack

_______________________________________________
openib-general mailing list
http://openib.org/mailman/listinfo/openib-general
