Re: maintaining stable HBase build

Ted Yu Sat, 24 Sep 2011 09:32:17 -0700

>> It should never have gone in if only to be reverted 35 minutes later.
(What happened?)


Since both Gary and Eugene have been working on HBASE-4014 for quite some
time, I didn't initially question the test cases.
After integrating the patch for TRUNK, I discovered that
TestRegionServerCoprocessorExceptionWithAbort failed consistently on Mac and
Linux. So I backed it out.
I first thought of disabling this particular test but later abandoned that
idea - if a core test fails, the feature itself may have an issue.
I notified Eugene immediately and he will take a look today.

>> Scrolling down the commit history for trunk further, is a series of
half-commits, addendums, reverts, reverts of reverts, etc.

If you were talking about
HBASE-4132<https://issues.apache.org/jira/browse/HBASE-4132>,
I initially tried to salvage the JIRA by adjusting the triggering assertion.
However, that turned out to be not so trivial. So I reopened the JIRA.

Just FYI

On Sat, Sep 24, 2011 at 9:13 AM, Andrew Purtell <[email protected]> wrote:

> +1
>
> This:
> >>>
> > For contributors, I understand that it takes so much time to run whole test
> > suite that he/she may not have the luxury of doing this - Apache Jenkins
> > wouldn't do it when you press Submit Patch button.
> > If this is the case (let's call it scenario A), please use Eclipse (or other
> > tool) to identify tests that exercise the classes/methods in your patch and
> > run them. Also clearly state what tests you ran in the JIRA.
> <<<
>
> and
>
> >>>
> > For scenario A, I hope committer would run test suite.
>
> <<<
>
>
> should be added to the How To Contribute page, IMHO.
>
>
> I see that HBASE-4014 went in -- which is important, so let's fix it and
> try again -- and then went right out again, reverted after 35 minutes. It
> should never have gone in if only to be reverted 35 minutes later. (What
> happened?) Scrolling down the commit history for trunk further, is a series
> of half-commits, addendums, reverts, reverts of reverts, etc.
>
> It has recently become difficult to cherry pick any single commit from
> trunk and get all of the necessary parts of a change together or have any
> assurance the change is not toxic. This is not just a maintainer issue --
> diffing the full extent of a change to understand it fully mixes in
> unrelated changes between the initial commit and addendums, unless one
> resorts to octopus like contortions with git.
>
>
> So what is the solution? Submitted for your consideration:
>
>
> Committers should apply a candidate change and run the full test suite
> before committing the change to trunk or any branch. If applying a change to
> a branch, a full test suite run of the branch code should complete
> successfully before commit there as well.
>
> No patch is so pressing that it cannot wait for tests to finish before
> commit, IMO.
>
> If a test fails, the patch does not go in.
>
> If a test fails repeatedly for unrelated reasons, the test comes out and a
> jira to fix it gets opened.
>
> Finally, I can see where people are trying to fix the build, so please
> exclude those commits from my complaint here, that is not part of the problem.
> Best regards,
>
>
>        - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
>
> ----- Original Message -----
> > From: Ted Yu <[email protected]>
> > To: [email protected]
> > Cc:
> > Sent: Saturday, September 24, 2011 3:51 AM
> > Subject: maintaining stable HBase build
> >
> > Hi,
> > I want to bring the importance of maintaining stable HBase build to our
> > attention.
> > A stable HBase build is important, not just for the next release but also
> > for authors of the pending patches to verify the correctness of their work.
> >
> > At some time on Thursday (Sept 22nd) 0.90, 0.92 and TRUNK builds were all
> > blue. Now they're all red.
> >
> > I don't mind fixing Jenkins build. But if we collectively adopt some good
> > practice, it would be easier to achieve the goal of having stable builds.
> >
> > For contributors, I understand that it takes so much time to run whole test
> > suite that he/she may not have the luxury of doing this - Apache Jenkins
> > wouldn't do it when you press Submit Patch button.
> > If this is the case (let's call it scenario A), please use Eclipse (or other
> > tool) to identify tests that exercise the classes/methods in your patch and
> > run them. Also clearly state what tests you ran in the JIRA.
> >
> > If you have a Linux box where you can run whole test suite, it would be nice
> > to utilize such resource and run whole suite. Then please state this fact on
> > the JIRA as well.
> > Considering Todd's suggestion of holding off commit for 24 hours after code
> > review, 2 hour test run isn't that long.
> >
> > Sometimes you may see the following (from 0.92 build 18):
> >
> > Tests run: 1004, Failures: 0, Errors: 0, Skipped: 21
> >
> > [INFO] ------------------------------------------------------------------------
> > [INFO] BUILD FAILURE
> > [INFO] ------------------------------------------------------------------------
> > [INFO] Total time: 1:51:41.797s
> >
> > You should examine the test summary above these lines and find out
> > which test(s) hung. For this case it was TestMasterFailover:
> >
> > Running org.apache.hadoop.hbase.master.TestMasterFailover
> > Running org.apache.hadoop.hbase.master.TestMasterRestartAfterDisablingTable
> > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 32.265 sec
> >
> > I think a script should be developed that parses test output and
> > identifies hanging test(s).
> >
> > For scenario A, I hope committer would run test suite.
> > The net effect would be a statement on the JIRA, saying all tests passed.
> >
> > Your comments/suggestions are welcome.
> >
>
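
For the script idea in the original message, here is a rough sketch in Python. It is not an existing HBase tool. It assumes tests run sequentially, so every "Running <class>" line should be followed by a "Tests run: ..., Time elapsed: ..." line before the next "Running" line appears. Any class without such a result line is reported as a likely hang. The function name find_hanging_tests and both regular expressions are my own choices for illustration, based only on the output lines quoted above.

```python
import re
import sys

# Line patterns taken from the build output quoted in this thread.
RUNNING = re.compile(r"^Running (\S+)")
# Per-test result lines carry "Time elapsed"; the final summary line does not.
RESULT = re.compile(r"^Tests run: \d+, .*Time elapsed:")


def find_hanging_tests(lines):
    """Return test classes that started but never printed a result line.

    Assumes sequential test execution (an assumption, not verified
    against every build configuration).
    """
    hanging = []
    pending = None  # class whose result line has not been seen yet
    for raw in lines:
        line = raw.strip()
        started = RUNNING.match(line)
        if started:
            if pending is not None:
                # A new test started before the previous one reported.
                hanging.append(pending)
            pending = started.group(1)
        elif RESULT.match(line):
            pending = None
    if pending is not None:
        # The last started test never reported a result.
        hanging.append(pending)
    return hanging


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_hanging_tests.py <build-log-file>
    with open(sys.argv[1]) as log_file:
        for name in find_hanging_tests(log_file):
            print(name)
```

Run against the 0.92 build 18 excerpt above (with the log lines unwrapped), this would flag TestMasterFailover. If tests were run in parallel, the "Running" lines could interleave and this simple pairing would need to be reworked.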
