----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/6396/#review10011 -----------------------------------------------------------
Ship it! Ship It! - Gordon Sim On Aug. 6, 2012, 5:51 p.m., Kenneth Giusti wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/6396/ > ----------------------------------------------------------- > > (Updated Aug. 6, 2012, 5:51 p.m.) > > > Review request for qpid, Alan Conway, Gordon Sim, and Ted Ross. > > > Description > ------- > > Occasionally, the cluster tests will fail in the > test_federation_multilink_failover test with the following error: > > cluster_tests.ShortTests.test_federation_multilink_failover > ................................................................................................................... > fail > Error during test: Traceback (most recent call last): > File > "/home/kgiusti/Desktop/work/qpid/trunk/build/qpid/src/tests/python/commands/qpid-python-test", > line 340, in run > phase() > File > "/home/kgiusti/Desktop/work/qpid/trunk/qpid/cpp/src/tests/cluster_tests.py", > line 992, in test_federation_multilink_failover > assert self._verify_federation(src_cluster[1], "FedX/two", > dst_cluster[1], "destQ2") > AssertionError > > The problem is due to a race condition in the LinkRegistry code. When a new > connection event occurs for a federation Link, the LinkRegistry attempts to > find a Link instance that is attempting to connect to the remote in order to > assign the connection. The problem is due to the fact that the search for > the target link is done under a lock, but the assignment is done outside of > the lock (to prevent lock inversion). > > The proposed fix has LinkRegistry hold all disconnected Links in a separate > container, and perform the search of that container (and the removal on > match) while holding a lock. > > > This addresses bug QPID-4193. > https://issues.apache.org/jira/browse/QPID-4193 > > > Diffs > ----- > > /trunk/qpid/cpp/src/qpid/broker/LinkRegistry.h 1368984 > /trunk/qpid/cpp/src/qpid/broker/LinkRegistry.cpp 1368984 > /trunk/qpid/cpp/src/qpid/cluster/Connection.cpp 1368984 > /trunk/qpid/cpp/src/qpid/cluster/UpdateClient.cpp 1368984 > > Diff: https://reviews.apache.org/r/6396/diff/ > > > Testing > ------- > > Federation and cluster unit tests. > Ran test_federation_multilink_failover repeatedly with no crash. > > > Thanks, > > Kenneth Giusti > >
