The 0.98 build is still showing this problem (latest as of now at https://builds.apache.org/job/hbase-0.98/803), so I went ahead and made the proposed change, but only to the 0.98 builds. I'll let you know if it provides any improvement.
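
For anyone who wants to reproduce the serialized run locally, here is a rough sketch of the command line. Only the two -D properties come from the proposal quoted below; the `clean test` goals and the variable name are my assumptions for illustration, and the real Jenkins job configuration may differ. The snippet only prints the command so it can be checked before it is run.

```shell
# The two properties from the proposal quoted below: one forked JVM per
# Surefire test phase, i.e. no parallel test execution.
SERIAL_TEST_OPTS="-Dsurefire.firstPartForkCount=1 -Dsurefire.secondPartForkCount=1"

# Assumed goals (hypothetical); substitute whatever the job actually runs.
# Printed rather than executed so the command can be inspected first.
echo "mvn clean test $SERIAL_TEST_OPTS"
```

Dropping the `echo` and the quotes runs the build for real.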
On Sun, Jan 18, 2015 at 10:00 AM, Andrew Purtell <[email protected]> wrote:
> Forked VMs are being killed in the 0.98 builds. That suggests
> infrastructure issues.
>
> Having only one test execute in a forked runner does mean the finding of a
> zombie and thread dumps or other state from the runner will identify and
> characterize a sick test with no unrelated state mixed in.
>
> > On Jan 17, 2015, at 7:43 PM, Stack <[email protected]> wrote:
> >
> > Agree, try anything to get our blues back. We add back the //ism after all
> > settles.
> >
> > Do you think something has changed in INFRA Andy? Is it more contended? Or,
> > more likely, is it that we've been committing stuff that has destabilized
> > builds? We had a good streak of blue there for a while. It just took some
> > work fixing breakage and watching jenkins to make sure breakage didn't
> > sneak in, but we've lapsed for sure.
> >
> > St.Ack
> >
> >> On Sat, Jan 17, 2015 at 9:19 AM, Dima Spivak <[email protected]> wrote:
> >>
> >> Not running tests in parallel will definitely cut down on Surefire
> >> flakiness (and in contention that sometimes leads to false failures in
> >> resource-hungry tests), but it will probably also balloon test run times to
> >> about two hours. Probably worth it in the short term, but we
> >> eventually need to do something about some of these heavy tests.
> >>
> >> -Dima
> >>
> >> On Friday, January 16, 2015, Andrew Purtell <[email protected]>
> >> wrote:
> >>
> >>> You might have missed the larger issue Ted.
> >>>
> >>>> On Jan 16, 2015, at 4:48 PM, Ted Yu <[email protected]> wrote:
> >>>>
> >>>> With HBASE-12874, we should get a green build for branch-1.0
> >>>>
> >>>> FYI
> >>>>
> >>>> On Fri, Jan 16, 2015 at 12:20 PM, Andrew Purtell <[email protected]>
> >>>> wrote:
> >>>>
> >>>>> See BUILDS-49 tracking issues specifically with 0.98 jobs, but I just
> >>>>> noticed trunk, branch-1, and branch-1.0 all failed after I checked in a
> >>>>> shell doc fix due to a timeout or fork failure.
> >>>>>
> >>>>> I propose we update all Jenkins jobs to not run tests in parallel, i.e. add
> >>>>> "-Dsurefire.firstPartForkCount=1 -Dsurefire.secondPartForkCount=1"
> >>>>>
> >>>>> --
> >>>>> Best regards,
> >>>>>
> >>>>> - Andy
> >>>>>
> >>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> >>>>> (via Tom White)
