Re: [Development] qtbase CI now takes 6 hours

Stephen Kelly Wed, 22 May 2013 03:09:11 -0700

On Tuesday, May 21, 2013 07:19:40 you wrote:
> Hi
> 
> I finally have some time to give you a proper answer :)


Thanks for all your responses!

> > 2) Can you notify this list when you know a CI blocker has been
> > introduced? (like the improvements you mentioned which were held back
> > until after beta1 and which cause some problems, and the current network
> > tests failing, which you might be aware of) That way we don't have to
> > point a finger at you and ask what's going on.
> 
> I'll take this as a lesson learned. I'll try to keep you more up to speed on
> what's going on :)

Great, thanks. 

I think we're learning lessons from this stuff already anyway. Today we 
disabled qtbase#dev integration because integrations will fail, and notified 
the mailing list about it. These are things we just need to continue doing.

> > 3) Can you disable staging when such problems occur, so that everyone
> > knows that it won't work anyway?
> 
> I wished I could have done that, but I don't know how to do that. I'll
> investigate how that could be achieved. Same thing crossed my mind though.

Ossi is able to do it, so he can probably answer any questions.

> > 4) You say you're responsible for keeping CI up and running. Does that
> > include keeping branches merging and keeping qt5.git up to date? If not,
> > then as I wrote before I think we still need a better way of tracking
> > that.
> 
> No. The CI team doesn't look or touch the content of the repos at all, not
> counting qtqa and sysadmin repos that include the tools which we keep up to
> date.

Ok, good to know.

> > 5) It's quite easy to see when the integration failed due to cloning, and
> > when it failed due to a problematic integration machine (eg
> > macx-clang_developer-build_qtnamespace_OSX_10.7), or a network server
> > issue. The problem is we don't really know if anything is being done to
> > address some of those issues. Is anything being done about
> > https://bugreports.qt-project.org/browse/QTBUG-30646 ? Either fixing the
> > problems that are occurring on it, or disabling it in places where it
> > still causes problems (qtbase and qtdeclarative)?
> 
> Flaky test cases are trouble, we know. We try to mark them as insignificant
> as we stumble upon them, although it's not exactly our job in the CI. We
> can't look into every failing build and look at the details behind it, and
> keep metrics of failing cases. For this we have created the QtMetrics
> web-page, which I will send an announcement of probably right after I
> answer you on this.

Great, looking forward to seeing the page.

> Who the person would then be to mark them as insignificant? I'm not quite
> sure. 

That's something for us to figure out. I guess I'll try to write a separate 
mail about that.

> 
> > 7) Does your responsibility for keeping CI up and running include anything
> > to do with solving problems of integration which are not related to the
> > patches under test? Does it involve marking tests or platforms
> > insignificant where appropriate and in a timely way? If not, then that's
> > something we need to know, as I wrote before, so we can see if a
> > 'scramble and fix it' solution can work.
> 
> The configuration of Jenkins and building is our responsibility. So when we
> create a new platform, say Windows 8 32 bit might be coming soon, we
> initially mark it as 'forcesuccess'. If we notice after the build that it
> passes, and perhaps even all tests pass, we change the 'forcesuccess' and /
> or 'insignificant' appropriately. 

Yes. Going in the other direction when it causes problems (getting a platform
from 'enforcing' to 'insignificant tests') is a process which is not
well-defined. Something also to discuss.

> We also take care of the building nodes /
> servers, so that they have the tools we need. We update visual studios,
> mingws, install Perl modules, update Puppet manifests etc. And if we notice
> in a build log, that some server for some reason lacked something, it's our
> job to investigate why it lacked it, was puppet out of sync or something,
> and we fix it. If we notice that some platform constantly fails (say
> windows 8 cmd.exe problem would occur every single time) we would just go
> ahead and mark it as 'forcesuccess'. Currently it passes from time to time,
> so we are keeping it as it is, although I know you'd like to see it changed
> ;) 

Yes :).

> Perhaps we should change this however...
> > The current situation of CI failing for reasons unrelated to the patches
> > under test is very frustrating. We need to find out who is responsible
> > for which parts of that and able to fix the issues.
> 
> True. We will also look at the option of adding a button somewhere that even
> though the build failed, if we know by looking at the results that the fail
> reason was unrelated to the change, we could bypass the checkpoint and
> forcefully merge it in Gerrit. This would help you in cases where we fail
> you :)

I can imagine that being controversial, and something we'd have to discuss. 
The likelihood of 'oh, actually that patch did cause the problem' is non-zero. 
Maybe if the option was restricted to a limited set of people... Anyway, 
something for a more-visible discussion.

> 
> A lot of what I wrote here should be in a web page perhaps. But, with our
> resources available, the communication and information sharing toward the
> community is also something that will suffer among other things. I'll add
> this to our backlog however.

Thanks, I'll see if I can put together a wiki page about this in a few days
as well.

Thanks,

-- 
Stephen Kelly <[email protected]> | Software Engineer
KDAB (Deutschland) GmbH & Co.KG, a KDAB Group Company
www.kdab.com || Germany +49-30-521325470 || Sweden (HQ) +46-563-540090
KDAB - Qt Experts - Platform-Independent Software Solutions


_______________________________________________
Development mailing list
[email protected]
http://lists.qt-project.org/mailman/listinfo/development
