[openstack-dev] [neutron][qa] test_network_basic_ops and the "FloatingIPChecker" control point

Brent Eagles Wed, 18 Dec 2013 19:27:23 -0800

Hi,

Yair and I were discussing a change that I initiated and that was incorporated into the test_network_basic_ops test. It was intended as a configuration control point for floating IP address assignments before actually testing connectivity. The question we were discussing was whether this check is a valid pass/fail criterion for tests like test_network_basic_ops.

The initial motivation for the change was that test_network_basic_ops had a less than 50/50 chance of passing in my local environment for whatever reason. After looking at the test, it seemed ridiculous that it should be failing. The problem is that, more often than not, the data available in the logs all pointed to it being set up correctly, but the ping test for connectivity was timing out. From the logs it wasn't clear whether the test was failing because neutron did not do the right thing, did not do it fast enough, or something else was happening. Of course, if I paused the test for a short bit between setup and the checks to manually verify everything, the checks always passed. So it's a timing issue, right?

Two things: adding more timeout to a check is as appealing to me as gargling glass, AND I was less "annoyed" that the test was failing than I was that it wasn't clear from reading the logs what had gone wrong. I tried to find an additional intermediate control point that would "split" failure modes into two categories: neutron is too slow in setting things up, and neutron failed to set things up correctly. Granted, it is still adding timeout to the test, but the idea was to find a control point based on "settling" so that, if it passed, there is a good chance that a failure in the next check means neutron actually screwed up what it was trying to do.
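To make the "split" concrete, here is a minimal sketch of the idea, not the actual tempest change. All names (wait_for, check_with_control_point, SettleTimeout, ConnectivityFailure) are hypothetical, and the settled and connected callables stand in for "the control point passed" and "the ping worked". The point is only that a timeout at the first stage and a timeout at the second stage are reported as different failures.

```python
import time


class SettleTimeout(Exception):
    """Control point never passed: setup was too slow or never finished."""


class ConnectivityFailure(Exception):
    """Control point passed, but the connectivity check still failed."""


def wait_for(predicate, timeout, interval,
             clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True or `timeout` seconds elapse.

    Returns True on success and False on timeout. `clock` and `sleep`
    are injectable so the helper can be exercised without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def check_with_control_point(settled, connected, settle_timeout,
                             connect_timeout, interval, **timing):
    """Run the settling check first, then the connectivity check.

    Raises a different exception for each stage so the failure mode
    is visible without digging through logs.
    """
    if not wait_for(settled, settle_timeout, interval, **timing):
        raise SettleTimeout("setup did not settle within %ss" % settle_timeout)
    if not wait_for(connected, connect_timeout, interval, **timing):
        raise ConnectivityFailure(
            "setup settled but connectivity failed within %ss" % connect_timeout)
```

With something like this, a SettleTimeout points at "too slow" and a ConnectivityFailure points at "set up, but wrong", subject to how trustworthy the settling criterion is.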

Waiting until a query to nova returns the floating IP information seemed a relatively reasonable, if imperfect, "settling" criterion before attempting to connect to the VM. Testing whether the floating IP assignment makes it into the nova instance details is a valid test and, AFAICT, missing from the current tests. However, Yair has the reasonable point that connectivity is often available long before the floating IP appears in the nova results, and that it could be considered invalid to use non-network-specific criteria as pass/fail for this test.

In general, the validity of checking for the presence of a floating IP in the server details is a matter of interpretation. I think it is a given that it must be tested somewhere and that, if it causes a test to fail, then it is as valid a failure as a ping failing. Certainly I have seen scenarios where an IP appears but doesn't actually work, and others where the IP doesn't appear (ever, not just after a really long while) but magically works. Both are bugs. Which is more appropriate to tests like test_network_basic_ops?

The polling interval for the checks in the gate should be tuned. The checks currently borrow another polling configuration, and I can see that this is ill-advised. They poll at an interval of one second, and if the intent is to wait for the entire system to settle down before proceeding, then polling nova that frequently is too often. It simply increases the load while we are trying to adapt to a loaded system. For example, in the course of a three-minute timeout, the floating IP check polled nova for server details 180 times.
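The arithmetic behind that number is simple enough to sketch. The helper name estimated_polls is hypothetical, and the 5 and 10 second intervals are only illustrative values, not proposed settings. The estimate ignores request latency and the extra poll made at time zero.

```python
import math


def estimated_polls(timeout_s, interval_s):
    """Rough number of requests made while polling for `timeout_s` seconds."""
    if timeout_s < 0 or interval_s <= 0:
        raise ValueError("timeout must be >= 0 and interval must be > 0")
    return math.ceil(timeout_s / interval_s)


if __name__ == "__main__":
    # Three-minute timeout, as in the example above.
    for interval in (1, 5, 10):
        print("interval=%ss -> about %d polls"
              % (interval, estimated_polls(180, interval)))
```

At a one-second interval this gives the 180 requests mentioned above; a five-second interval would cut that to about 36 for the same timeout.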

All this aside, it is granted that checking for the floating IP in the nova instance details is imperfect in itself. Nothing guarantees that the presence of that information means the networking backend has finished its work.


Comments, suggestions, queries, foam bricks?

Cheers,

Brent

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
