Re: [openstack-dev] [tripleo] CI is currently down: 2 blockers

Dougal Matthews Fri, 16 Sep 2016 02:37:23 -0700

For those interested, we now have a minimal way to reproduce the
MessagingTimeout in Mistral.


    https://bugs.launchpad.net/mistral/+bug/1624284

It seems to be related to this change in Mistral:


https://github.com/openstack/mistral/commit/1b0f0cddd620a3785017bb28d432cb0030b627d7

And even more specifically, this line:


https://github.com/openstack/mistral/commit/1b0f0cddd620a3785017bb28d432cb0030b627d7#diff-fa1c08d9053a1e6736fb8ac64e51d1ab

Thomas Herve managed to work around it by changing the executor.


On 16 September 2016 at 01:19, Emilien Macchi <[email protected]> wrote:

> So here's an update about the current situation:
>
> Master / Newton
> gate-tripleo-ci-centos-7-ovb-nonha
> gate-tripleo-ci-centos-7-ovb-ha
> The 2 jobs are supposed to pass, but some runs are timing out in the RH1 cloud.
> In order to reduce the timeouts, Ben ran:
> heat-manage purge_deleted 3
> nova-manage db archive_deleted_rows --verbose --max_rows 1000000
> sudo mysqlcheck -o -A
>
> gate-tripleo-ci-centos-7-nonha-multinode
> We merged the revert: https://review.openstack.org/#/c/370250/
> At the time I'm writing this email, the job is still non-voting:
> https://review.openstack.org/#/c/371133/
> But hopefully Infra will merge this patch soon to bring it back in the
> gate.
>
>
> stable/mitaka and stable/liberty
> gate-tripleo-ci-centos-7-ovb-nonha works fine.
> gate-tripleo-ci-centos-7-ovb-ha is broken because Galera was updated
> in EPEL (and TripleO Mitaka still deploys EPEL).
> I have 2 patches in order to fix the situation:
> 1) Fix Galera configuration to work with recent EPEL (kudos to Damien
> for his help): https://review.openstack.org/#/c/371029/
> 2) (not required but good to have) Disable EPEL in tripleoclient
> https://review.openstack.org/#/c/369559/ - I would understand if
> people -1 this patch and I have no strong opinion about it.
>
> I hope 1) will pass CI so we can just move forward.
>
> It's end of day for me but if someone can monitor
> http://tripleo.org/cistatus.html during Friday morning and make sure
> everything is still running fine, we would appreciate it. Also please
> report any bug related to CI and set the ci & alert tags.
>
> Thanks, and let's keep focusing on Newton release!
>
> On Thu, Sep 15, 2016 at 11:26 AM, Emilien Macchi <[email protected]>
> wrote:
> > On Wed, Sep 14, 2016 at 10:13 PM, Emilien Macchi <[email protected]>
> > wrote:
> >> Hi,
> >>
> >> Just a heads-up before end of day:
> >>
> >> 1) multinode job is failing 80% of the time. James and I made some
> >> attempts to revert or fix things, but we have been unsuccessful so
> >> far.
> >> Everything is documented here:
> >> https://bugs.launchpad.net/tripleo/+bug/1623606
> >
> > We found out that https://review.openstack.org/#/c/368760/ is breaking
> > us, so we will revert it and work on it again later.
> >
> >> 2) ovb jobs are timing out during NetworkDeployment because
> >> 99-refresh-completed is not signaling to Heat due to instance-id being
> >> detected as null by os-apply-config.
> >> James proposed a revert: https://review.openstack.org/#/c/370250/
> >> But the patch can't be merged because of 1).
> >
> > We are going to merge James's revert; we think it will bring back OVB
> > jobs.
> >
> > To merge the reverts, we need to disable voting on multinode jobs:
> > https://review.openstack.org/#/c/370922/
> >
> > Please do not merge anything today (except the 2 reverts) until our
> > situation becomes more stable. Probably tonight or tomorrow.
> > Once the situation is better, I or someone else on the team will give an
> > update here.
> >
> > Thanks for your understanding,
> >
> >> I'll continue to work on it tomorrow but if you're able to jump in and
> >> make progress on it, this downtime is very critical at this stage of
> >> the cycle.
> >>
> >> Any help is highly welcome.
> >>
> >> Thanks,
> >> --
> >> Emilien Macchi
> >
> >
> >
> > --
> > Emilien Macchi
>
>
>
> --
> Emilien Macchi
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: [email protected]?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
