Once we agree on what the flow should be, I'm also in favor of reverting patches that break things. Such commits cause problems for all subsequent patches, with multiple people ending up debugging the same test failure. In terms of disabling tests - we may want to consider how often a test fails before disabling it. There are flaky tests, and I suspect we'll end up with a lot of disabled tests if a test is disabled after a single failure. There have been offline suggestions about adding findbugs analysis, rat checks, etc. The indentation checks, I think, fall into the same category. It would be really nice to hear from others as well. These changes will be painful to start with - but should help once the tests stabilize further.
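For reference, disabling a test with a tracked reason could look like the following minimal jUnit 4 sketch (the class name, test name, and jira id are placeholders, not an actual Hive test):

    import org.junit.Ignore;
    import org.junit.Test;

    public class TestFlakyExample {

      // Hypothetical flaky test, disabled with the tracking jira in the
      // reason string; surefire then reports it as skipped rather than
      // letting it fail intermittently.
      @Ignore("Flaky: suspected race in setup, tracked in HIVE-XXXXX")
      @Test
      public void testRaceProneScenario() throws Exception {
        // test body omitted
      }
    }

Skipped tests stay visible in the test report, so an umbrella jira can keep a live inventory of what still needs fixing.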
On Fri, Oct 14, 2016 at 2:01 AM, Peter Vary <pv...@cloudera.com> wrote:

> +1 from me too.
>
> I think it lowers the barrier for contributors if there are clean and
> easy-to-follow rules for adding a patch. I had a very good experience with
> the Hadoop commit flow, where only “all green” results are accepted.
>
> If we are thinking about changes in the commit flow, it would be great to
> run code format checks before and after the patch, and call out any
> increase in formatting errors. I see too many nits in reviews where the
> reviewer points out formatting problems. This could be avoided, and the
> reviewer could concentrate on the real code instead of formatting.
>
> After we decide on the new commit flow, we should update the documentation
> on the wiki as well.
>
> Of course I am happy to help out with any of the related tasks too, so if
> you need my help just say so :)
>
> Thanks,
> Peter
>
> > On Oct 14, 2016, at 10:29 AM, Zsombor Klara <zsombor.kl...@cloudera.com> wrote:
> >
> > +1 from my side as well. I also like the suggestion to revert a patch
> > causing new test failures instead of expecting the owner to fix the
> > issues in follow-up jiras.
> >
> > I would also make a borderline "radical" suggestion: if a flaky test
> > is failing and the committer isn't sure at a glance that the failure is
> > benign (a stat diff failure, for example, I would consider low risk;
> > race conditions, on the other hand, high risk), put an ignore annotation
> > on it. In my view a flaky test is pretty much useless. If it breaks,
> > people will just ignore it (because it fails so often...), and anyway,
> > how can you be sure that the green run is the standard and the failure
> > is the exception, and not the other way around? We have thousands of
> > tests; ignoring a dozen won't decrease coverage significantly and will
> > take us that much closer to a point where we can demand a green
> > pre-commit run.
> >
> > Thanks
> >
> > On Fri, Oct 14, 2016 at 9:45 AM, Prasanth Jayachandran <
> > pjayachand...@hortonworks.com> wrote:
> >
> >> +1 on the proposal. Adding more to it.
> >>
> >> A lot of time has been spent on improving the test runtime and
> >> bringing down the flaky tests. The following jiras should give an
> >> overview of the effort involved:
> >> https://issues.apache.org/jira/browse/HIVE-14547
> >> https://issues.apache.org/jira/browse/HIVE-13503
> >>
> >> Committers, please ensure that the reported failures are absolutely
> >> not related to the patch before committing it.
> >>
> >> I would also propose the following to keep the runs clean, plus some
> >> tips to keep test runs fast:
> >>
> >> 1) Revert a patch that is causing a failure. It should be the
> >> responsibility of the contributor to make sure the patch is not
> >> causing any failures. I am against creating follow-ups for fixing
> >> test failures, because they usually get ignored or get lower
> >> priority, causing wasted failure-analysis effort and time for every
> >> other developer waiting to commit their patch.
> >>
> >> 2) +1 from reviewers AFTER a clean green run, or if a reviewer is
> >> convinced that the test failures are unrelated. Maybe we should stop
> >> conditional +1s and wait for a clean green run.
> >>
> >> 3) Avoid slow tests (there is a jira to print out the runtime of
> >> newly added tests). In general, it's good to have many smaller tests
> >> as opposed to a single big test. If the qfile or junit test is big,
> >> splitting it up will help in parallelizing them and avoiding
> >> stragglers.
> >>
> >> 4) Avoid adding tests to MiniMr (slowest of all).
> >>
> >> 5) Try to keep the test runtime (see the surefire report to get the
> >> correct runtime without initialization time) under a minute.
> >>
> >> 6) Avoid adding more read-only tables to the init script, as this
> >> will increase the initialization time.
> >>
> >> 7) If the test case does not require an explain plan, avoid it, as
> >> most failures are explain diffs.
> >>
> >> 8) If the test case requires explain and does not depend on table or
> >> partition stats, explicitly set stats for the table or partition.
> >> Explicitly setting stats avoids expensive stats computation time and
> >> avoids flakiness due to stats diffs.
> >>
> >> 9) Prefer jUnit over qtest.
> >>
> >> 10) Add an explicit timeout for jUnit tests to avoid indefinite
> >> hanging (surefire times out after 40 mins).
> >>
> >> Thoughts?
> >>
> >> Thanks
> >> Prasanth
> >>
> >> On Oct 13, 2016, at 11:10 PM, Siddharth Seth <ss...@apache.org> wrote:
> >>
> >> There's been a lot of work to make the test runs faster, as well as
> >> more reliable, via HIVE-14547, HIVE-13503, and several other jiras.
> >> Test runtimes are around the 1-hour mark, and going down. There were
> >> a few green pre-commit runs (after years?). At the same time, there
> >> are still some flaky tests.
> >>
> >> We really should try to keep the test runtimes down, as well as the
> >> number of failures, so that the pre-commit runs can provide useful
> >> information.
> >>
> >> I'm not sure what the current approach is w.r.t. pre-commit runs
> >> before a commit. What I've seen in other projects is that the
> >> pre-commit needs to run, and come back clean (mostly), before a
> >> commit goes in. Between what used to be 5-day wait times and
> >> inconsistent runs, I don't think this is always followed in Hive.
> >>
> >> It'll be useful to start relying on pre-commit test results again.
> >> Given the flaky tests, I'd suggest the following:
> >> 1. Pre-commit must be run on a patch before committing (with very
> >> few exceptions).
> >> 2. A green test run is ideal.
> >> 3. In case there are failures, keep track of these as sub-jiras under
> >> a flaky-test umbrella jira (some under HIVE-14547 already), to be
> >> eventually fixed.
> >> 4. Before committing, cite the relevant jira for a flaky test (create
> >> and cite one if it doesn't already exist).
> >>
> >> This should help us build up a list of flaky tests over various runs,
> >> which will hopefully get fixed at some point.
> >>
> >> Thoughts?
> >>
> >> Thanks,
> >> Sid
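
To make point 10 in Prasanth's list concrete: a minimal jUnit 4 sketch using the stock @Test timeout parameter, which fails the test fast instead of waiting for surefire's 40-minute kill (the class and method names and the one-minute budget are placeholders, not an actual Hive test):

    import static org.junit.Assert.assertEquals;

    import org.junit.Test;

    public class TestTimeoutExample {

      // Times out after 60 seconds, echoing the per-test runtime budget
      // in point 5, instead of hanging until the 40-minute surefire limit.
      @Test(timeout = 60000)
      public void testSomethingSmall() {
        assertEquals(4, 2 + 2);
      }
    }

A hung test then shows up as a single fast failure in the report, rather than stalling the whole pre-commit run.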