Of the three slowest tests as of last week, two have been fixed (thanks to evan and jcampan), and the third has a fix under review.
The Linux trybots now build almost as fast as the Mac trybots, thanks to a shared ccache with a ~80% cache hit rate. Windows trybots are still struggling, taking 3X as long to build as Linux/Mac.

On Mon, Sep 21, 2009 at 8:25 AM, Nicolas Sylvain <[email protected]> wrote:
> Hi chromium-dev,
>
> A small group of us joined forces to create a "Green Tree" task force. The goal of this task force is to make sure the tree stays green most of the time. The 2 main pain points that we are attacking at this time are "reducing the buildbot cycle time", to catch errors earlier, and "getting rid of the flakiness", to make sure the tree does not turn red for no reason.
>
> I'll be prepending "[Green Tree]" to the emails I send related to the task force.
>
> You can also follow the progress and our tasks there: http://code.google.com/p/chromium/issues/list?q=label:GreenTreeTaskForce
>
> For those interested, these are the highlights of the last week:
> - Make sure all the tasks have bugs associated with them (pamg)
> - Make sure VMWare Tools is installed on all the slaves (bev / nsylvain)
> - Disable all services that we don't need on the slaves (bev)
> - Split the windows chromium tests in 3 slaves (maruel)
> - Change the gatekeeper to close the tree on more failures (maruel)
> - Change LKGR to care about more tests, and make it cycle faster (maruel)
> - Write a status page to see the cycle speed on the slaves (nsylvain)
> - Make sure we build only what we need on Mac (thomasvl)
> - Add more try bots (linux views, valgrind) (maruel)
> - Refactor Linux Valgrind buildbots into builder/testers. (mmoss)
> - Create a dashboard to see the slowest tests (phajdan)
> - Speed up the transfer of builds between builders/testers by reducing the compression (mmoss)
>
> I'm sure I forgot some, feel free to append to this list.
>
> Despite our efforts, this was one of the worst weeks we've seen in a long time in terms of tree closure.
> This was caused by 5 main events:
> - Buildbot maintenance went wrong. By changing a mounted drive on the buildbot master, the mount table got corrupted, and we had to reboot the main server. We started the maintenance at 7:30AM (pacific) and we got the buildbot back online shortly after 10AM. It had to cycle a little, so the tree was closed for almost 3 hours.
> - A webkit merge left some failures in the tree. And it looks like everyone left without fixing it, so it was closed overnight. We fixed it in the morning, but before reopening we let another webkit merge go by, and it also broke the tree, requiring a change on webkit.org to fix the reliability tests (IIRC). Total closure time: 20 hours.
> - A bad gclient change got checked in. Some machines stopped running "runhooks" and some bots got confused. The damage seems to have been limited.
> - A second bad gclient change got checked in, this time causing all the bots to throw away their checkouts. Almost every slave had to do a full checkout (which takes an hour or so), and some of them ran out of disk space, so we had to manually fix them. The tree was closed for another couple of hours.
> - A bad DEPS file got checked in, again causing a bunch of slaves to throw away their checkouts. The tree was closed for another hour or two.
>
> Nicolas
