I agree with all of this, but can we please break up the tests and make them shorter? :)
On Thu, Apr 2, 2015 at 8:54 AM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:

> This is secondary to Marcelo’s question, but I wanted to comment on this:
>
>     Its main limitation is more cultural than technical: you need to get people
>     to care about intermittent test runs, otherwise you can end up with
>     failures that nobody keeps on top of
>
> This is a problem that plagues Spark as well, but there *is* a technical
> solution.
>
> The solution is simple: *All* the builds that we care about run for *every*
> proposed change. If *any* build fails, the change doesn’t make it into the
> repository.
>
> Spark already has a pull request builder that tests and reports back on
> PRs. Committers don’t merge in PRs when this builder reports that it failed
> some tests. That’s a good thing.
>
> The problem is that there are several other builds that we run on a fixed
> interval, independent of the pull request builder. These builds test
> different configurations, dependency versions, and environments than what
> the PR builder covers. If one of those builds fails, it fails on its own
> little island, with no-one to hear it scream. The build failure is detached
> from the PR that caused it to fail.
>
> What should happen is that the whole matrix of stuff we care to test gets
> run for every PR. No PR goes in if any build we care about fails for that
> PR, and every build we care about runs for every commit of every PR.
>
> Really, this is just an extension of the basic idea of the PR builder. It
> doesn’t make much sense to test stuff *after* it has been committed and
> potentially broken things. And it becomes exponentially more difficult to
> find and fix a problem the longer it has been festering in the repo. It’s
> best to keep such problems out in the first place.
>
> With some more work on our CI infrastructure, I think this can be done.
> Maybe even later this year.
> Nick
>
> On Thu, Apr 2, 2015 at 6:02 AM, Steve Loughran <ste...@hortonworks.com> wrote:
>
>> On 2 Apr 2015, at 06:31, Patrick Wendell <pwend...@gmail.com> wrote:
>>
>>> Hey Marcelo,
>>>
>>> Great question. Right now, some of the more active developers have an
>>> account that allows them to log into this cluster to inspect logs (we
>>> copy the logs from each run to a node on that cluster). The
>>> infrastructure is maintained by the AMPLab.
>>>
>>> I will put you in touch with someone there who can get you an account.
>>>
>>> This is a short-term solution. The longer-term solution is to have
>>> these scp'd regularly to an S3 bucket or somewhere people can get
>>> access to them, but that's not ready yet.
>>>
>>> - Patrick
>>
>> ASF Jenkins is always there to play with; committers/PMC members should
>> just need to file a BUILD JIRA to get access.
>>
>> Its main limitation is more cultural than technical: you need to get
>> people to care about intermittent test runs, otherwise you can end up with
>> failures that nobody keeps on top of
>> https://builds.apache.org/view/H-L/view/Hadoop/
>>
>> Someone really needs to own the "keep the builds working" problem, and
>> have the ability to somehow kick others into fixing things. The latter is
>> pretty hard cross-organisation.
>>
>>>> That would be really helpful to debug build failures. The scalatest
>>>> output isn't all that helpful.
>>
>> Potentially an issue with the test runner, rather than the tests
>> themselves.
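The gating policy Nick describes above (every build in the matrix runs for every PR, and nothing merges unless all of them pass) can be sketched as a toy model. This is purely illustrative: the configuration axes, the `run_build` helper, and the PR shapes are all made up and stand in for real CI jobs.

```python
# Toy sketch of per-PR build-matrix gating (hypothetical names throughout).
from itertools import product

def run_build(jdk, hadoop, change):
    # Stand-in for a real CI job: a change "passes" a configuration
    # unless it is explicitly marked as breaking that configuration.
    return (jdk, hadoop) not in change.get("breaks", set())

def can_merge(change, matrix):
    # Gate the merge on every cell of the matrix,
    # not just one default PR-builder configuration.
    return all(run_build(jdk, hadoop, change) for jdk, hadoop in matrix)

# Illustrative matrix: two JDKs crossed with two Hadoop profiles.
matrix = list(product(["jdk7", "jdk8"], ["hadoop-2.4", "hadoop-2.6"]))

clean_pr = {"id": 1}
broken_pr = {"id": 2, "breaks": {("jdk7", "hadoop-2.4")}}

print(can_merge(clean_pr, matrix))   # True: all four cells pass
print(can_merge(broken_pr, matrix))  # False: one cell of the matrix fails
```

The point of the sketch is the `all(...)` in `can_merge`: a failure on any "island" of the matrix blocks the merge, instead of failing silently on a periodic build after the commit has landed.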