Re: [openstack-dev] [tripleo] rh1 issues post-mortem

2017-06-02 Thread Wesley Hayutin
On Fri, Jun 2, 2017 at 4:42 PM, Ben Nemec wrote:
>
> On 03/28/2017 05:01 PM, Ben Nemec wrote:
>
>> Final (hopefully) update:
>>
>> All active compute nodes have been rebooted and things seem to be stable
>> again. Jobs are even running a little faster, so I'm thinking

Re: [openstack-dev] [tripleo] rh1 issues post-mortem

2017-06-02 Thread Ben Nemec
On 03/28/2017 05:01 PM, Ben Nemec wrote:
Final (hopefully) update:
All active compute nodes have been rebooted and things seem to be stable again. Jobs are even running a little faster, so I'm thinking this had a detrimental effect on performance too. I've set a reminder for about two

Re: [openstack-dev] [tripleo] rh1 issues post-mortem

2017-03-28 Thread Ben Nemec
Final (hopefully) update: All active compute nodes have been rebooted and things seem to be stable again. Jobs are even running a little faster, so I'm thinking this had a detrimental effect on performance too. I've set a reminder for about two months from now to reboot again if we're still

Re: [openstack-dev] [tripleo] rh1 issues post-mortem

2017-03-24 Thread Ben Nemec
To follow up on this, we've continued to hit this issue on other compute nodes. Not surprising, of course. They've all been up for about the same period of time and have had largely even workloads. It has caused problems, though, because it is cropping up faster than I can respond (it takes a

Re: [openstack-dev] [tripleo] rh1 issues post-mortem

2017-03-24 Thread Derek Higgins
On 22 March 2017 at 22:36, Ben Nemec wrote:
> Hi all (owl?),
>
> You may have missed it in all the ci excitement the past couple of days, but
> we had a partial outage of rh1 last night. It turns out the OVS port issue
> Derek discussed in
>