I've been running overcloud CI tests on hp1 to establish whether it's ready to be turned back on for real CI. I'd like to add it back in soon, but first I have some numbers we should look at and some decisions to make.

The hp1 cloud throws up more false negatives than rh1. Nearly all of these are problems in either the nova bm driver or the neutron l3 agent. The pass rate improves from somewhere around 40% to about 85% with the following 2 patches:

https://review.openstack.org/#/c/121492/ # ensure l3 agent doesn't fail if neutron-server isn't ready
https://review.openstack.org/#/c/121155/ # Increase sleep times in nova-bm driver

With these 2 patches I think the pass rate is acceptable, but there is a difference in runtime: overcloud jobs run in about 140 minutes on hp1 (rh1 is averaging about 95 minutes).

We are using VMs with 2G of memory. With 3G VMs the runtime goes down to about 120 minutes. This is an option to save a little time, but we would end up losing 33% of our capacity (in simultaneous jobs).

How would people feel about turning hp1 back on and increasing the timeout to allow for the increased runtimes?

While making changes we should also consider switching back to x86_64 and bumping VMs to 4G. This would essentially halve the number of jobs we can run simultaneously, but CI would test what most deployments would actually be using.

Also, it's worth noting that the test I have been using to compare jobs is the F20 overcloud job. Something has happened recently that causes this job to run slower than it used to (possibly up to 30 minutes slower); I'll now try to get to the bottom of this. So the times may not end up being as high as referenced above, but I'm assuming the relative differences between the two clouds won't change.

Thoughts?

Derek
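
P.S. For anyone who wants to weigh the 2G vs 3G trade-off, here is a rough sketch of the arithmetic. It assumes a fixed pool of memory on hp1 (so the number of simultaneous jobs scales inversely with VM size) and that throughput is simply slots divided by runtime; the function name and the model are mine, purely for illustration, and nothing here is measured beyond the runtimes quoted above.

```python
from fractions import Fraction


def relative_throughput(mem_gb, runtime_min, base_mem_gb=2, base_runtime_min=140):
    """Jobs completed per unit time, relative to the 2G / 140 minute baseline.

    Assumption (not measured): the memory pool is fixed, so the number of
    simultaneous jobs is proportional to base_mem_gb / mem_gb.
    """
    slots = Fraction(base_mem_gb, mem_gb)              # share of simultaneous jobs that still fit
    speedup = Fraction(base_runtime_min, runtime_min)  # how much faster each job finishes
    return slots * speedup


three_gb = relative_throughput(3, 120)
lost_slots = 1 - Fraction(2, 3)
print(f"3G VMs: {float(lost_slots):.0%} fewer slots, "
      f"{float(three_gb):.0%} of the 2G throughput")
```

Under those assumptions, 3G VMs work out to 7/9 of the 2G throughput: the shorter runtime only partly offsets the lost slots. I have no runtime figure for 4G VMs, so that case is left out.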