Re: [openstack-dev] [tripleo] CI is currently down: 2 blockers
For those interested, we now have a minimal way to reproduce the
MessagingTimeout in Mistral:
https://bugs.launchpad.net/mistral/+bug/1624284

It seems to be related to this change in Mistral:
https://github.com/openstack/mistral/commit/1b0f0cddd620a3785017bb28d432cb0030b627d7

And even more specifically, this line:
https://github.com/openstack/mistral/commit/1b0f0cddd620a3785017bb28d432cb0030b627d7#diff-fa1c08d9053a1e6736fb8ac64e51d1ab

Thomas Herve managed to work around it by changing the executor.

On 16 September 2016 at 01:19, Emilien Macchi wrote:
> So here's an update on the current situation:
>
> Master / Newton
> gate-tripleo-ci-centos-7-ovb-nonha
> gate-tripleo-ci-centos-7-ovb-ha
> The 2 jobs are supposed to pass, but some jobs are timing out in the RH1 cloud.
> In order to reduce the timeouts, Ben ran:
> heat-manage purge_deleted 3
> nova-manage db archive_deleted_rows --verbose --max_rows 100
> sudo mysqlcheck -o -A
>
> gate-tripleo-ci-centos-7-nonha-multinode
> We merged the revert: https://review.openstack.org/#/c/370250/
> At the time I'm writing this email, the job is still non-voting:
> https://review.openstack.org/#/c/371133/
> But hopefully Infra will merge this patch soon to bring it back into the
> gate.
>
>
> stable/mitaka and stable/liberty
> gate-tripleo-ci-centos-7-ovb-nonha works fine.
> gate-tripleo-ci-centos-7-ovb-ha is broken because Galera was updated
> in EPEL (and TripleO Mitaka still deploys EPEL).
> I have 2 patches to fix the situation:
> 1) Fix Galera configuration to work with recent EPEL (kudos to Damien
> for his help): https://review.openstack.org/#/c/371029/
> 2) (not required but good to have) Disable EPEL in tripleoclient
> https://review.openstack.org/#/c/369559/ - I would understand if
> people -1 this patch and I have no strong opinion about it.
>
> I hope 1) will pass CI so we can just move forward.
>
> It's end of day for me but if someone can monitor
> http://tripleo.org/cistatus.html during Friday morning and make sure
> everything is still running fine, we would appreciate it. Also please
> report any bug related to CI and set the ci & alert tags.
>
> Thanks, and let's keep focusing on the Newton release!
>
> On Thu, Sep 15, 2016 at 11:26 AM, Emilien Macchi wrote:
> > On Wed, Sep 14, 2016 at 10:13 PM, Emilien Macchi wrote:
> >> Hi,
> >>
> >> Just a heads-up before end of day:
> >>
> >> 1) The multinode job is failing 80% of the time. James and I made some
> >> attempts to revert or fix things, but we have been unsuccessful so
> >> far.
> >> Everything is documented here: https://bugs.launchpad.net/tripleo/+bug/1623606
> >
> > We found out that https://review.openstack.org/#/c/368760/ is breaking
> > us, so we will revert it and work on it again later.
> >
> >> 2) The ovb jobs are timing out during NetworkDeployment because
> >> 99-refresh-completed is not signaling to Heat due to instance-id being
> >> detected as null by os-apply-config.
> >> James proposed a revert: https://review.openstack.org/#/c/370250/
> >> But the patch can't be merged because of 1).
> >
> > We are going to merge James's revert; we think it will bring back the OVB jobs.
> >
> > To merge the reverts, we need to disable voting on multinode jobs:
> > https://review.openstack.org/#/c/370922/
> >
> > Please do not merge anything today (except the 2 reverts) until our
> > situation becomes more stable. Probably tonight or tomorrow.
> > Once the situation is better, I or someone else in the team will give an
> > update here.
> >
> > Thanks for your understanding,
> >
> >> I'll continue to work on it tomorrow but if you're able to jump in and
> >> make progress on it, please do: this downtime is very critical at this
> >> stage of the cycle.
> >>
> >> Any help is highly welcome.
> >>
> >> Thanks,
> >> --
> >> Emilien Macchi
> >
> > --
> > Emilien Macchi
>
> --
> Emilien Macchi

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
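On the executor workaround mentioned in the message above: the message only says that the MessagingTimeout is related to the linked Mistral change and that changing the executor avoids it. One generic way an executor choice can surface as a timeout is a handler that waits on a nested call served by the same, already fully occupied executor. The sketch below shows that pattern with Python's standard concurrent.futures. It is an assumption-laden illustration, not Mistral or oslo.messaging code, and it may not be the mechanism behind bug 1624284; the function names and the 0.2 s timeout are hypothetical.

```python
import concurrent.futures as cf


def handle_outer(pool, timeout):
    """Handler that issues a nested call on the same pool and waits for it."""
    inner = pool.submit(lambda: "inner done")
    # With a single worker, that worker is busy running this function,
    # so the nested call can never start and this wait times out.
    return inner.result(timeout=timeout)


def nested_call_result(max_workers, timeout=0.2):
    """Return the nested call's result, or "timeout" if it starved."""
    pool = cf.ThreadPoolExecutor(max_workers=max_workers)
    try:
        outer = pool.submit(handle_outer, pool, timeout)
        return outer.result(timeout=timeout * 4)
    except cf.TimeoutError:
        return "timeout"
    finally:
        pool.shutdown(wait=True)


if __name__ == "__main__":
    print("1 worker :", nested_call_result(1))
    print("2 workers:", nested_call_result(2))
```

With one worker the nested call starves and the caller sees a timeout; with two workers the same code completes. The linked bug report has the real reproducer.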
Re: [openstack-dev] [tripleo] CI is currently down: 2 blockers
So here's an update on the current situation:

Master / Newton
gate-tripleo-ci-centos-7-ovb-nonha
gate-tripleo-ci-centos-7-ovb-ha
The 2 jobs are supposed to pass, but some jobs are timing out in the RH1 cloud.
In order to reduce the timeouts, Ben ran:
heat-manage purge_deleted 3
nova-manage db archive_deleted_rows --verbose --max_rows 100
sudo mysqlcheck -o -A

gate-tripleo-ci-centos-7-nonha-multinode
We merged the revert: https://review.openstack.org/#/c/370250/
At the time I'm writing this email, the job is still non-voting:
https://review.openstack.org/#/c/371133/
But hopefully Infra will merge this patch soon to bring it back into the
gate.


stable/mitaka and stable/liberty
gate-tripleo-ci-centos-7-ovb-nonha works fine.
gate-tripleo-ci-centos-7-ovb-ha is broken because Galera was updated
in EPEL (and TripleO Mitaka still deploys EPEL).
I have 2 patches to fix the situation:
1) Fix Galera configuration to work with recent EPEL (kudos to Damien
for his help): https://review.openstack.org/#/c/371029/
2) (not required but good to have) Disable EPEL in tripleoclient
https://review.openstack.org/#/c/369559/ - I would understand if
people -1 this patch and I have no strong opinion about it.

I hope 1) will pass CI so we can just move forward.

It's end of day for me but if someone can monitor
http://tripleo.org/cistatus.html during Friday morning and make sure
everything is still running fine, we would appreciate it. Also please
report any bug related to CI and set the ci & alert tags.

Thanks, and let's keep focusing on the Newton release!

On Thu, Sep 15, 2016 at 11:26 AM, Emilien Macchi wrote:
> On Wed, Sep 14, 2016 at 10:13 PM, Emilien Macchi wrote:
>> Hi,
>>
>> Just a heads-up before end of day:
>>
>> 1) The multinode job is failing 80% of the time. James and I made some
>> attempts to revert or fix things, but we have been unsuccessful so
>> far.
>> Everything is documented here:
>> https://bugs.launchpad.net/tripleo/+bug/1623606
>
> We found out that https://review.openstack.org/#/c/368760/ is breaking
> us, so we will revert it and work on it again later.
>
>> 2) The ovb jobs are timing out during NetworkDeployment because
>> 99-refresh-completed is not signaling to Heat due to instance-id being
>> detected as null by os-apply-config.
>> James proposed a revert: https://review.openstack.org/#/c/370250/
>> But the patch can't be merged because of 1).
>
> We are going to merge James's revert; we think it will bring back the OVB jobs.
>
> To merge the reverts, we need to disable voting on multinode jobs:
> https://review.openstack.org/#/c/370922/
>
> Please do not merge anything today (except the 2 reverts) until our
> situation becomes more stable. Probably tonight or tomorrow.
> Once the situation is better, I or someone else in the team will give an
> update here.
>
> Thanks for your understanding,
>
>> I'll continue to work on it tomorrow but if you're able to jump in and
>> make progress on it, please do: this downtime is very critical at this
>> stage of the cycle.
>>
>> Any help is highly welcome.
>>
>> Thanks,
>> --
>> Emilien Macchi
>
> --
> Emilien Macchi

--
Emilien Macchi
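The three cleanup commands Ben ran in the update above can be collected into a small wrapper so they run in a fixed order and stop at the first failure. This is only a sketch: the command lines are copied from the email, while the wrapper itself, the names CLEANUP_COMMANDS and run_cleanup, and the dry-run default are assumptions added for illustration.

```python
import subprocess

# Command lines exactly as quoted in the email.
CLEANUP_COMMANDS = [
    # Purge Heat database rows that were soft-deleted more than 3 days ago.
    ["heat-manage", "purge_deleted", "3"],
    # Move up to 100 soft-deleted Nova rows into the shadow tables.
    ["nova-manage", "db", "archive_deleted_rows", "--verbose", "--max_rows", "100"],
    # Optimize the tables in all MySQL databases.
    ["sudo", "mysqlcheck", "-o", "-A"],
]


def run_cleanup(dry_run=True):
    """Run (or, by default, only print) the cleanup commands in order."""
    handled = []
    for cmd in CLEANUP_COMMANDS:
        line = " ".join(cmd)
        if dry_run:
            print("would run:", line)
        else:
            # check=True raises CalledProcessError and aborts on failure.
            subprocess.run(cmd, check=True)
        handled.append(line)
    return handled


if __name__ == "__main__":
    run_cleanup(dry_run=True)
```

The dry-run default is deliberate: all three commands modify production databases, so they should only be run for real on the cloud's controller and with an operator watching.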
Re: [openstack-dev] [tripleo] CI is currently down: 2 blockers
On Wed, Sep 14, 2016 at 10:13 PM, Emilien Macchi wrote:
> Hi,
>
> Just a heads-up before end of day:
>
> 1) The multinode job is failing 80% of the time. James and I made some
> attempts to revert or fix things, but we have been unsuccessful so
> far.
> Everything is documented here: https://bugs.launchpad.net/tripleo/+bug/1623606

We found out that https://review.openstack.org/#/c/368760/ is breaking
us, so we will revert it and work on it again later.

> 2) The ovb jobs are timing out during NetworkDeployment because
> 99-refresh-completed is not signaling to Heat due to instance-id being
> detected as null by os-apply-config.
> James proposed a revert: https://review.openstack.org/#/c/370250/
> But the patch can't be merged because of 1).

We are going to merge James's revert; we think it will bring back the OVB jobs.

To merge the reverts, we need to disable voting on multinode jobs:
https://review.openstack.org/#/c/370922/

Please do not merge anything today (except the 2 reverts) until our
situation becomes more stable. Probably tonight or tomorrow.
Once the situation is better, I or someone else in the team will give an
update here.

Thanks for your understanding,

> I'll continue to work on it tomorrow but if you're able to jump in and
> make progress on it, please do: this downtime is very critical at this
> stage of the cycle.
>
> Any help is highly welcome.
>
> Thanks,
> --
> Emilien Macchi

--
Emilien Macchi
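To make the failure in item 2) concrete: if the step that signals Heat only runs when an instance ID is present, a null instance-id means no signal is ever sent and the deployment waits until it times out. The sketch below models that guard in Python. It is hypothetical and is neither the real 99-refresh-completed script nor os-apply-config code; only the key name instance-id comes from the email, and should_signal is an invented helper.

```python
import json


def should_signal(metadata_json):
    """Return True only if the metadata carries a usable instance-id."""
    metadata = json.loads(metadata_json)
    # JSON null becomes None; a missing key is treated the same way.
    instance_id = metadata.get("instance-id")
    return bool(instance_id)


if __name__ == "__main__":
    for doc in ('{"instance-id": "i-0001"}', '{"instance-id": null}', "{}"):
        state = "signal Heat" if should_signal(doc) else "skip (Heat keeps waiting)"
        print(doc, "->", state)
```

In this model the null and missing cases both skip the signal silently, which matches the symptom described above: nothing fails outright, the job simply times out during NetworkDeployment.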
[openstack-dev] [tripleo] CI is currently down: 2 blockers
Hi,

Just a heads-up before end of day:

1) The multinode job is failing 80% of the time. James and I made some
attempts to revert or fix things, but we have been unsuccessful so
far.
Everything is documented here: https://bugs.launchpad.net/tripleo/+bug/1623606

2) The ovb jobs are timing out during NetworkDeployment because
99-refresh-completed is not signaling to Heat due to instance-id being
detected as null by os-apply-config.
James proposed a revert: https://review.openstack.org/#/c/370250/
But the patch can't be merged because of 1).

I'll continue to work on it tomorrow but if you're able to jump in and
make progress on it, please do: this downtime is very critical at this
stage of the cycle.

Any help is highly welcome.

Thanks,
--
Emilien Macchi
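A rough way to see why blocker 1) holds up the revert for blocker 2): a patch merges only if every voting job passes, so a voting job that fails 80% of the time lets at most about one attempt in five through. The sketch below works out the chance of at least one passing run after a number of rechecks. It assumes each run fails independently at the quoted 80% rate, which is a simplification; pass_probability is a hypothetical helper.

```python
def pass_probability(failure_rate, attempts):
    """Chance that at least one of `attempts` independent runs passes."""
    return 1.0 - failure_rate ** attempts


if __name__ == "__main__":
    for attempts in (1, 3, 5, 10):
        chance = pass_probability(0.8, attempts)
        print(f"{attempts:2d} attempt(s): {chance:.1%} chance of a passing run")
```

Under these assumptions even five rechecks leave roughly a one-in-three chance that the multinode job never passes.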