Hi all, I know many of you are interested in knowing how the CI is doing. We have been looking for a while at how to improve our communication in this area, and of course at how to get the whole process running more smoothly for everyone.
From now on we'll have weekly meetings, similar to the release team's: each Tuesday at 13:00 CEST ( http://www.timeanddate.com/worldclock/fixedtime.html?msg=Qt+QA+meeting&iso=20131022T13&p1=187&ah=1 ).

Since we just had a first attempt at getting this going, here is a short summary; the IRC log is below.

* We have openSUSE machines integrated in the CI (already announced on this list). OpenSSL was missing but will be on them from now on.
* Android: tests are not yet running and need some help from the Android team (androiddeployqt is missing). The basic infrastructure is in place.
* Some issues with V4 on ARM are still being fixed; they should be done today, though.
* QQuick2 test flakiness:
  * lots of timing-dependent tests
  * maybe use CPU time (bogomips) for QTRY_VERIFY and friends
  * maybe check the QML engine for running animations (would need new API)
* Network test server: we'd like to update it, but nobody is actively working on it right now (the server based on Ubuntu 12.04 is almost working; some tests still fail).
* Long term we would like more reliability by running defined snapshots of VMs for the tests; currently the test machines simply keep on running.

Cheers,
Frederik

<tosaraja> ok, without further delaying this for no reason, let's begin :) So hello everybody
<lars> fregl: here as well
<olhirvon> tosaraja is going to lead the discussion
<fregl> great :) hi lars
<ablasche> hi
<pejarven> o/
<tosaraja> I didn't prepare much of an agenda, mainly just thought of some things we could discuss. For starters I could tell you about the current activities in the CI, what we are doing and what problems we might have. And then if you have any questions for us or want to discuss something, please do so whenever you feel like it
<fregl> I guess these meetings are still new, so we'll find the best structure over time, but for now I'd say we can quickly go through current issues
<fregl> one current thing that is interesting to me: sifalt, you found we don't have OpenSSL everywhere?
<tosaraja> SuSE's got their OpenSSL development library just 20 minutes ago https://codereview.qt-project.org/#change,68203
<fregl> tosaraja: manually or using puppet?
<tosaraja> fregl: using puppet... so you can add 15 minutes to that ;)
<sifalt> fregl: yes, the openSUSE and wince70 embedded env
<tosaraja> the embedded one is still unsolved
<fregl> ok, that is less bad than I thought :)
<fregl> does it block tests on wince or do we simply not run them?
<tosaraja> Which test is run for OpenSSL? How did you notice it was missing from SUSE?
<fregl> nierob_: ^
<sahumada> fregl: we don't run tests for wince
<fregl> I guess Enginio fails without OpenSSL, otherwise we probably just skip the tests when running make check
<fregl> sahumada: ok, then I think this is not an urgent issue
<nierob_> by default, if Qt is compiled without OpenSSL the tests are not executed
<nierob_> fregl: they are ifdef'ed
<fregl> fkleint: since we have lars here, maybe we can talk about the Quick2 tests? you have tried to stabilize them and did not get too far, right?
<fregl> tosaraja: what do you have on the agenda? we just talked on Friday so I don't have much else right now. any general status update?
<lars> fregl: I'm working on that right now (in a way). I'm going through our tests and checking them against GC corruptions (found a few already).
<lars> that will hopefully help, but we'll only see over time
<fkleint> fregl: Erik mainly tried (see mail)
<fkleint> fregl: he found that he had to insert arbitrary sleeps
<fkleint> but that is not satisfactory
<tosaraja> fregl: not much, really. After we get the current discussions out of the way, I was going to tell you about the blockers in other areas
<fregl> so one point Erik raises in the mail is that make check does not skip insignificant tests - any comments on that? tosaraja, sahumada?
<fkleint> lars: We are facing the problem that the stuff shows quite some non-deterministic behaviour due to the multithreadedness and the different render loops
<sahumada> fregl: don't know of such an email :)
<fkleint> lars: Talked to the Squish folks at DevDays and they are facing the same issue
<fkleint> lars: You basically have to wait for animations to finish, etc., and frames to be synced before proceeding with the test
<fregl> sahumada: forwarded
<fkleint> lars: that is quite hard on machines under load
<fkleint> basically increase the sleep until it passes ;-)
<fregl> I think in a way we keep hitting the same old issue: timers make tests really non-deterministic and machines sometimes run under load
<fkleint> I wonder if we could have an API to check whether animations are stopped and synced
<tosaraja> fregl: sahumada: I haven't looked at the script running the tests, but I would imagine that make check doesn't read the files containing the insignificant flags, while our Perl scripts do. We would have to transfer the logic from Perl to make to enable that. I might be totally wrong here, but I suspect this is how it works
<fkleint> QTRY_VERIFY(QuickEngine.idle()) or similar
<lars> fkleint: fregl: yes, that's one big issue. the best option is probably to talk to gunnar about it.
<fkleint> hm, ok
<lars> fkleint: we fixed some issues by speeding up animations and using proper waitForWindowShown etc.
<lars> fkleint: did them when gunnar was here 2 weeks ago
<lars> I think the listview test is a lot better now, for example.
<fkleint> yep, but CI machines can be under load and really slow
<fregl> another idea that janarve had was to make tests work with CPU time instead of wall time or some such
<fkleint> but that might differ across platforms
<janarve> yes
<janarve> actually per-process time
<fregl> maybe someone can tell us if that would be a sensible way of going about it. otherwise we can make it an action point to check with gunnar about animations
<lars> fregl: the problem is that we're using QTRY_COMPARE in quite a few places, and that'll time out. so we'd need a different way of telling us that animations are done.
<lars> but we really shouldn't even get close to 5 secs with any of our animations. that slows down tests a lot as well
<nierob_> I thought animations use vsync, which is real time. So using process time will not help
<janarve> lars: I agree. I think measuring process time would solve that
<nierob_> but it would help in the network tests
<janarve> nierob_: no, vsync is just for drawing. It still uses a timer to measure how far along the animation should be
<nierob_> ok
<janarve> Btw, do we have evidence that QTRY_COMPARE is really a big problem?
<fkleint> there is an "overload" where you can specify the timeout?
<janarve> yes, but it's still not bulletproof
<fkleint> or another macro, QTRY_COMPARE_TIMEOUT or so
<nierob_> fkleint: yes, but the problem remains
<fkleint> ok, so it should be possible to specify process time there
<fkleint> hm
<janarve> If we measured process time, we could have QTRY_COMPARE_BOGOMIPS(how_many_bogomips_until_timeout)
<fkleint> Anyone up for maintaining QTestLib, btw?
<janarve> the problem is that that parameter would be *very* large and not very easy to guess
<fkleint> then we'd need to add another set of macros..
<fregl> I don't think anyone looks at QTestLib at the moment
<fkleint> that macro-riddled design is a bit suboptimal..
<nierob_> janarve: QTestLib could query the CPU before running the tests and get bogomips stats
<nierob_> janarve: then it could estimate the waiting time
<janarve> fkleint: well, all comparing in testlib is based on macros
<janarve> nierob_: so what number would you give to the macro?
<fregl> ok, janarve, fkleint, nierob_, do you think we can start a task force to try implementing this? figuring out sane numbers can be done once we have a proof of concept, I would say
<nierob_> janarve: it would be nice to accept time, which could be recomputed to bogomips
<fregl> we don't need to solve it right here and now
<janarve> I am interested
<janarve> fregl: ^
<ossi|tt> i don't think using cpu time is any good. it doesn't buy anything if some events get lost, etc.
<nierob_> me too
<fregl> ossi|tt: how do events get lost?
<fkleint> but coming back to Quick2, would some API like QmlEngine::idle() help?
<fkleint> (also thinking of Squish)
<fregl> nierob_: janarve: great, you have an AP. I guess it might be a combination of both times.
<ossi|tt> fregl: typically they are not sent ;)
<fregl> ossi|tt: that sounds like a bug in need of fixing then
<janarve> I'll discuss with nierob_
<ossi|tt> fregl: yes. we are talking about autotests ;)
<fregl> fkleint: the problem is animations running indefinitely
<fkleint> yes, except those, basically
<fregl> ossi|tt: yes, so if events get lost, we'd better not fix the test but the code that loses them
<ossi|tt> fregl: the point is that relying on cpu time will simply make some failing tests wait forever
<fkleint> foreach (QAnimation *a) if (a->isRunning() && !isIndefinite) return false
<ossi|tt> fregl: just don't rely on anything that relates to the program doing something in particular
<fregl> ok, so we need to talk to gunnar about that. is he on irc atm?
<fkleint> Hm. can't see him
<ossi|tt> fregl: wall time is the best you can get. make a long enough timeout, and make sure that the good case will *not* need much time.
<sletta> fregl: here
<fkleint> ossi|tt: Enter the multithreaded world of Quick2 ;-) this is not widgets
<fregl> sletta: fkleint had some good questions relating to animations in Quick
<ossi|tt> fkleint: i'm not sure what this has to do with anything
<fregl> sletta: basically we were discussing flaky tests - some depend on animations finishing
<tosaraja> Can systems where CPU and bus speeds adjust according to load mess up things like calculating CPU time?
<fregl> sletta: is there a way to check if there are running animations (apart from those running forever)?
<fkleint> sletta: backlog at http://paste.kde.org/p7fe90954
<ossi|tt> tosaraja: the kernel is supposed to account for that when reporting cpu time. but anyway, as i said, i don't think cpu time is a good idea.
<tosaraja> ossi|tt: right
<ossi|tt> tosaraja: it may be a good idea to use resource-limited cgroups to contain amok-running tests, though.
<fregl> ossi|tt: I think we can let nierob_ and janarve try to prototype something, and if that fails, it fails.
<tosaraja> ossi|tt: are you volunteering to investigate this? ;)
<fkleint> Quick2 tests rather tend not to run amok, but just fail..
<ossi|tt> tosaraja: not really. but fundamentally it isn't hard ^^
<ossi|tt> fregl: it has already failed, because logic says it's not going to be reliable by design
<fregl> ossi|tt: I guess that's Linux-only, though. and I'd like to move on to the QQuick2 problems
<ossi|tt> fregl: yes, it is
<fregl> ok, let's take the cpu time discussion after this meeting.
<sletta> fregl: we don't have an API to query for running animations, or to separate indefinitely running animations from animations with a definite but unknown finish time
<nierob_> sletta: could we have it?
<sletta> how would that help?
<fkleint> QTRY_VERIFY(QmlEngine::idle())
<fkleint> then proceed
<fkleint> with testing
<tosaraja> ok, if you pick this up after the meeting or have another thread here on the side (I think we can manage that), I'd have another one for you: compiling V4 for ARM. It seems like BlackBerry is stumbling upon the same problem we now have
<sletta> that still excludes metainvokes, timers and other async behaviour which is heavily used throughout Qt. I'm not sure that will fix anything
<fkleint> OK, so everyone can sleep on it and maybe develop some ideas
<fregl> lars: V4 on ARM is being fixed, if I understand correctly?
<tosaraja> it was already discussed briefly on the release mailing list. Do we have anyone doing the implementation for ARM? (currently it's only for Thumb)
<fregl> tosaraja: actually I think Simon is working on that right now
<sletta> Why does QTRY_VERIFY time out in the first place... If it ends up hanging, the test will be killed anyway
<tosaraja> fregl: great! :)
<lars> fregl: yes, tronical has a fix he's now cleaning up
<lars> fregl: ARM is actually mostly working, Android was the issue :)
<fregl> tosaraja: so that one should be there in a day or two
<lars> fregl: erikv is also working on some ARM-related issues
<tosaraja> fregl: I'll start a build now and then... then :)
<lars> tosaraja: fregl: we hopefully have it all fixed tonight...
<tosaraja> Then we have Android testing as an issue
<fregl> tosaraja: so let's see if it works tomorrow. the fix needs to go through integration first, I guess ;)
<tosaraja> sifalt was trying with the latest changes, but was missing something. Do we have Simo here?
<lars> fregl: we've had few issues with CI on declarative lately, so I'm positive
<sifalt> tosaraja: I cherry-picked eskil's changes from gerrit, but it looked like I was still missing something
<olhirvon> sifalt: ^^
<fregl> sifalt: maybe best to talk to eskil directly
<sifalt> it was nagging about a missing androiddeployqt
<tosaraja> getting those Androids running tests shouldn't be that hard. We already have a line of 10 tablets waiting to be tested on. Once we get the first machine set up and it manages to test correctly, we can just clone it, and we'll have 10 machines connected to 10 tablets running the tests
<tosaraja> meaning we can have several submodules verified on Android.
<tosaraja> iOS on the simulator is something I have on my todo list, and I will probably start working on that already this week
<fregl> tosaraja: are the machines physical?
<tosaraja> fregl: no, 10 virtual machines connected to 10 tablets. I will hard-code the different IP addresses of the tablets into environment variables
<fregl> right, so the VMs can access USB. sounds good.
<tosaraja> fregl: until we come up with a way to easily create a pool of devices and maintain a state machine of which is occupied and which is free, I thought this would be the easy way out
<fregl> tosaraja: sounds sensible to me
<tosaraja> fregl: the connection is done over IP, not USB
<fregl> ok
<tosaraja> fregl: the tablet's ADB server is set to listen on IP
<fregl> well, seems like you have that under control
<tosaraja> then... the new network test server
<tosaraja> it hasn't seen any progress lately.
<nierob_> btw. Digia used to sell a solution for multiple-mobile testing / code execution, so maybe we have such a thing ready
<tosaraja> afaik peter-h fixed the tests to work on the new server, but as for the remainder, I don't think we have gotten anywhere
<tosaraja> we still have a few tests failing, preventing us from upgrading
<tosaraja> With all these other things on my table, I haven't had time to do anything about this. And I reckon neither has anyone else ;)
<tosaraja> But on to more promising news... having the CI create VMs on demand... and always using a clean clone is progressing.
<tosaraja> Qt has finally gotten its own vSphere to build on. We have the enterprise version of Jenkins installed as a trial
<tosaraja> Now I can start playing around with it and see how it connects and how it works. We should also have the APIs for vSphere available now, if we want to create the modules for Jenkins ourselves.
<tosaraja> Do we have anyone here who has done that previously? Written new plugins for Jenkins, I mean
<tosaraja> If we did that work ourselves, we could continue using the free version of Jenkins
<fregl> that means we can basically start snapshotting CI machines and rolling back to old states. Maybe we should consider moving in that direction and depending less on puppet.
<tosaraja> Yes, we could scrap puppet after that
<fregl> I think as a first step the snapshotting sounds more important than also starting VMs on demand
<tosaraja> and each time we make a modification to the template, we could run a whole Qt build and run the tests on it to see that we don't break anything
<fregl> yes
<fregl> we seem to have the capacity to run all those machines now, so that should not change
<tosaraja> fregl: snapshotting from the current situation is a bad idea, since the machines are already out of sync and not clones. We would need to re-create every machine first, then create a base snapshot and continue from there
<fregl> if the puppet (or manual) update is run on only one machine at a time, as described above, we can then start using that machine to test everything, and if it works, indeed use it as the new template for the other VMs
<tosaraja> fregl: and having the on-demand creation would do that for us
<fregl> yes, of course start the snapshots from a clean slate
<fregl> tosaraja: but how do you teach Jenkins that it can just create machines? or do we somehow pretend to Jenkins that we still have a limited number of machines running?
<tosaraja> fregl: the first step Jenkins would do is create a new node for itself, and as that node starts, it would create a VM that connects to it
<tosaraja> fregl: at least that's how I pictured myself doing it
<fregl> tosaraja: I don't understand. how does it create a node? is a node not a running machine already?
<tosaraja> fregl: I don't really know how Jenkins manages to create a new node on the fly, or how it would start working on it, as it would be offline until something connected to it... it might be that some master node would have to take care of the initial setup
<tosaraja> fregl: ^ still to test and figure out ;)
<fregl> tosaraja: so I think this is rather hard, which is why I would start with a pool of machines first. These can then be reverted back to a snapshot after the test run.
<sifalt> tosaraja: fregl: I don't think it is a problem. We already have a situation where the master thinks there are more nodes than there actually are
<tosaraja> fregl: how would we fit puppet into all that? If we had a snapshot and needed to run puppet on it, we would need to update the snapshot
<fregl> sifalt: ok, if that can be done, that's really great.
<tosaraja> ok... but it's past 3 o'clock here and the time is up for this meeting
<fregl> tosaraja: I would imagine: run the snapshot, let puppet run, take a new snapshot
<fregl> ok, let's just meet again next Tuesday.
<fregl> can someone write a short summary? we can attach the backlog.
<tosaraja> obviously nothing much changes regarding the usage of this channel... whenever you need us or each other, we will be here. I'll just be heading home now. Thanks and bye :)
<fregl> tosaraja: yes. and it's good to have a bit of focused time for this :)
<tosaraja> indeed
<olhirvon> This seems to be useful :) Thanks everyone for the active participation.

-- 
Best regards,
Frederik Gladhorn
Senior Software Engineer - Digia, Qt
Visit us on: http://qt.digia.com

_______________________________________________
Development mailing list
[email protected]
http://lists.qt-project.org/mailman/listinfo/development
