Re: [openstack-dev] [tripleo] Adding hp1 back running tripleo CI

2014-09-17 Thread Derek Higgins
On 15/09/14 22:37, Gregory Haynes wrote:
 This is a total shot in the dark, but a couple of us ran into issues
 with the Ubuntu Trusty kernel (I know I hit it on HP hardware) that was
 causing severely degraded performance for TripleO. This was fixed with a
 recently released kernel in Trusty... maybe you could be running into
 this?

thanks Greg,

To try this out, I've redeployed the new testenv image and run 35
overcloud jobs on it (32 passed). The average time for these was 130
minutes, so unfortunately no major difference.

The old kernel was
3.13.0-33-generic #58-Ubuntu SMP Tue Jul 29 16:45:05 UTC 2014 x86_64
the new one is
3.13.0-35-generic #62-Ubuntu SMP Fri Aug 15 01:58:42 UTC 2014 x86_64
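In case it's useful to anyone comparing hosts, here's a quick sketch (purely illustrative, not part of any deployment tooling) of checking whether a testenv host is on the fixed kernel by comparing the Ubuntu ABI number embedded in the `uname -r` release string:

```python
# Ubuntu Trusty kernel builds quoted above
OLD = "3.13.0-33-generic"  # build with the degraded-performance bug
NEW = "3.13.0-35-generic"  # build that shipped the fix

def abi_number(release):
    """ABI number of an Ubuntu kernel release string, e.g. 35 for '3.13.0-35-generic'."""
    return int(release.split("-")[1])

def kernel_is_fixed(release):
    """True if the given release is at least the -35 build carrying the fix."""
    return abi_number(release) >= abi_number(NEW)

print(kernel_is_fixed(OLD))  # False
print(kernel_is_fixed(NEW))  # True
```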

Derek

 


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] Adding hp1 back running tripleo CI

2014-09-17 Thread Clint Byrum
Excerpts from Derek Higgins's message of 2014-09-17 06:53:25 -0700:
 On 15/09/14 22:37, Gregory Haynes wrote:
  This is a total shot in the dark, but a couple of us ran into issues
  with the Ubuntu Trusty kernel (I know I hit it on HP hardware) that was
  causing severely degraded performance for TripleO. This was fixed with a
  recently released kernel in Trusty... maybe you could be running into
  this?
 
 thanks Greg,
 
 To try this out, I've redeployed the new testenv image and run 35
 overcloud jobs on it (32 passed). The average time for these was 130
 minutes, so unfortunately no major difference.
 
 The old kernel was
 3.13.0-33-generic #58-Ubuntu SMP Tue Jul 29 16:45:05 UTC 2014 x86_64

This kernel definitely had the kvm bugs Greg and I experienced in the
past.

 the new one is
 3.13.0-35-generic #62-Ubuntu SMP Fri Aug 15 01:58:42 UTC 2014 x86_64
 

Darn. This one does not. Is it possible the hardware is just less
powerful?



[openstack-dev] [tripleo] Adding hp1 back running tripleo CI

2014-09-15 Thread Derek Higgins
I've been running overcloud CI tests on hp1 to establish if it's ready to
turn back on running real CI. I'd like to add it back in soon, but first
I have some numbers we should look at and some decisions to make.

The hp1 cloud throws up more false negatives than rh1; nearly all of
these are either problems within the nova bm driver or the neutron l3
agent. Things improve from a pass rate of somewhere around 40% to about
85% with the following 2 patches:
https://review.openstack.org/#/c/121492/ # ensure l3 agent doesn't fail
if neutron-server isn't ready
https://review.openstack.org/#/c/121155/ # Increase sleep times in
nova-bm driver

With these 2 patches I think the pass rate is acceptable, but there is a
difference in runtime: overcloud jobs run in about 140 minutes (rh1 is
averaging about 95 minutes).

We are using VMs with 2G of memory; with 3G VMs the runtime goes down
to about 120 minutes. This is an option to save a little time, but we end
up losing 33% of our capacity (in simultaneous jobs).
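To make that tradeoff concrete, here's a back-of-the-envelope sketch. The slot count of 12 is hypothetical; the assumption is just that concurrent-job capacity scales inversely with VM memory, which is what the 33% figure implies:

```python
def jobs_per_hour(slots, minutes_per_job):
    """Aggregate throughput with `slots` jobs running concurrently."""
    return slots * 60.0 / minutes_per_job

slots_2g = 12                  # hypothetical concurrent-job capacity with 2G VMs
slots_3g = slots_2g * 2 // 3   # 3G VMs cost ~33% of capacity -> 8 slots

print(round(jobs_per_hour(slots_2g, 140), 2))  # 2G VMs: 5.14 jobs/hour
print(round(jobs_per_hour(slots_3g, 120), 2))  # 3G VMs: 4.0 jobs/hour
```

So on aggregate throughput the 2G configuration still comes out ahead even though each individual job is ~20 minutes slower; the 3G option only pays off if wall-clock time per job matters more than queue depth.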

How would people feel about turning back on hp1 and increasing the
timeout to allow for the increased runtimes?

While making changes we should also consider switching back to x86_64
and bumping VMs to 4G, essentially halving the number of jobs we can
simultaneously run, but CI would then test what most deployments would
actually be using.

Also it's worth noting the test I have been using to compare jobs is the
F20 overcloud job; something has happened recently causing this job to
run slower than it used to (possibly up to 30 minutes slower), and I'll
now try to get to the bottom of it. So the times may not end up being as
high as referenced above, but I'm assuming the relative differences
between the two clouds won't change.

thoughts?
Derek



Re: [openstack-dev] [tripleo] Adding hp1 back running tripleo CI

2014-09-15 Thread Gregory Haynes
This is a total shot in the dark, but a couple of us ran into issues
with the Ubuntu Trusty kernel (I know I hit it on HP hardware) that was
causing severely degraded performance for TripleO. This was fixed with a
recently released kernel in Trusty... maybe you could be running into
this?

-Greg

