What we can do here is estimate how much effort we want to put in and set a longer-term target, such as:
Third quarter 2018 -- 1hr SLO
Fourth quarter 2018 -- 30min SLO, etc.
Combined with a policy for newly added tests, this can give us a goal to aim for.

--Mikhail

Have feedback <http://go/migryz-feedback>?

On Mon, May 21, 2018 at 2:06 PM Scott Wegner <sweg...@google.com> wrote:

> Thanks for the proposal, I left comments in the doc. Overall I think it's
> a great idea.
>
> I've seen other projects with much faster pre-commits, and it requires
> strict guidelines on unit test design and keeping tests isolated in-memory
> as much as possible. That's not currently the case in Java; we have
> pre-commits which submit pipelines to the Dataflow service.
>
> I don't know if it's feasible to get Java down to 15-20 mins in the short
> term, but a good starting point would be to document the requirements for a
> test to run as pre-commit, and start enforcing them for new tests.
>
>
> On Fri, May 18, 2018 at 3:25 PM Henning Rohde <hero...@google.com> wrote:
>
>> Good proposal. I think it should be considered in tandem with the "No
>> commit on red post-commit" proposal and could be far more ambitious than 2
>> hours. For example, something in the <15-20 mins range, say, would be much
>> less of an inconvenience to the development effort. Go takes ~3 mins, which
>> means that it is practical to wait until a PR is green before asking anyone
>> to look at it. If I need to wait for a Java or Python pre-commit, I task
>> switch and come back later. If the post-commits are enforced to be green,
>> we could possibly gain a much more productive flow at the cost of the
>> occasional post-commit break, compared to now. Maybe IOs can be less
>> extensively tested pre-commit, for example, or only if actually changed?
>>
>> I also like Robert's suggestion of splitting up pre-commits into something
>> more fine-grained to get a clear partial signal quicker. If we have an
>> adequate number of Jenkins slots, it might also speed things up overall.
>>
>> Thanks,
>> Henning
>>
>> On Fri, May 18, 2018 at 12:30 PM Scott Wegner <sweg...@google.com> wrote:
>>
>>> re: intelligently skipping tests for code that doesn't change (i.e. Java
>>> tests on Python PR): this should be possible. We already have build-caching
>>> enabled in Gradle, but I believe it is local to the git workspace and
>>> doesn't persist between Jenkins runs.
>>>
>>> With a quick search, I see there is a Jenkins Build Cacher Plugin [1]
>>> that hooks into Gradle build cache and does exactly what we need. Does
>>> anybody know whether we could get this enabled on our Jenkins?
>>>
>>> [1] https://wiki.jenkins.io/display/JENKINS/Job+Cacher+Plugin
>>>
>>> On Fri, May 18, 2018 at 12:08 PM Robert Bradshaw <rober...@google.com>
>>> wrote:
>>>
>>>> [somehow my email got garbled...]
>>>>
>>>> Now that we're using gradle, perhaps we could be more intelligent about
>>>> only running the affected tests? E.g. when you touch Python (or Go) you
>>>> shouldn't need to run the Java precommit at all, which would reduce the
>>>> latency for those PRs and also the time spent in queue. Presumably this
>>>> could even be applied per-module for the Java tests. (Maybe a large, shared
>>>> build cache could help here as well...)
>>>>
>>>> I also wouldn't be opposed to a quicker immediate signal, plus more
>>>> extensive tests before actually merging. It's also nice to not have to wait
>>>> an hour to see that you have a lint error; quick stuff like that could be
>>>> signaled quickly before a contributor loses context.
>>>>
>>>> - Robert
>>>>
>>>>
>>>>
>>>> On Fri, May 18, 2018 at 5:55 AM Kenneth Knowles <k...@google.com> wrote:
>>>>
>>>>> I like the idea. I think it is a good time for the project to start
>>>>> tracking this and keeping it usable.
>>>>>
>>>>> Certainly 2 hours is more than enough, is that not so? The Java
>>>>> precommit seems to take <=40 minutes while Python takes ~20 and Go is so
>>>>> fast it doesn't matter. Do we have enough stragglers that we don't
>>>>> make it in the 95th percentile? Is the time spent in the Jenkins queue?
>>>>>
>>>>> For our current coverage, I'd be willing to go for:
>>>>>
>>>>>  - 1 hr hard cap (someone better at stats could choose %ile)
>>>>>  - roll back or remove test from precommit if fix looks like more than
>>>>> 1 week (roll back if it is perf degradation, remove test from precommit if
>>>>> it is additional coverage that just doesn't fit in the time)
>>>>>
>>>>> There's a longer-term issue that doing a full build each time is
>>>>> expected to linearly scale up with the size of our repo (it is the
>>>>> monorepo problem but for a minirepo) so there is no cap that is feasible
>>>>> until we have effective cross-build caching. And my long-term goal would
>>>>> be <30 minutes. At the latency of opening a pull request and then checking
>>>>> your email that's not burdensome, but an hour is.
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Thu, May 17, 2018 at 6:54 PM Udi Meiri <eh...@google.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>> I have a proposal to improve contributor experience by keeping
>>>>>> precommit times low.
>>>>>>
>>>>>> I'm looking to get community consensus and approval about:
>>>>>> 1. How long should precommits take. 2 hours @95th percentile over the
>>>>>> past 4 weeks is the current proposal.
>>>>>> 2. The process for dealing with slowness. Do we: fix, roll back,
>>>>>> remove a test from precommit?
>>>>>> Rolling back if a fix is estimated to take longer than 2 weeks is the
>>>>>> current proposal.
>>>>>>
>>>>>>
>>>>>> https://docs.google.com/document/d/1udtvggmS2LTMmdwjEtZCcUQy6aQAiYTI3OrTP8CLfJM/edit?usp=sharing
>>>>>>
>>>>>
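
For reference, the "2 hours @95th percentile over the past 4 weeks" check proposed in the thread could be computed roughly as below. This is only a sketch under assumptions: the function names, the (finish time, minutes) input shape, and the nearest-rank percentile method are illustrative choices, not anything the proposal specifies, and fetching the run durations from Jenkins is not shown.

```python
import math
from datetime import datetime, timedelta


def percentile(values, pct):
    """Nearest-rank percentile (assumed method; the thread does not pick one)."""
    if not values:
        raise ValueError("no precommit runs to evaluate")
    ordered = sorted(values)
    # Nearest rank: smallest value with at least pct% of the data at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def precommit_slo_report(runs, now, target_minutes=120, pct=95,
                         window=timedelta(weeks=4)):
    """Return (observed percentile in minutes, whether the target is met).

    `runs` is a hypothetical list of (finish_time, duration_minutes) pairs;
    only runs that finished within `window` before `now` are counted.
    """
    recent = [minutes for finished, minutes in runs if now - finished <= window]
    observed = percentile(recent, pct)
    return observed, observed <= target_minutes


if __name__ == "__main__":
    now = datetime(2018, 5, 21)
    # Made-up sample data for illustration only.
    sample = [(now - timedelta(days=d), 35 + d) for d in range(20)]
    print(precommit_slo_report(sample, now))
    # Same data checked against the 1 hr cap suggested later in the thread.
    print(precommit_slo_report(sample, now, target_minutes=60))
```

Changing `target_minutes` allows the same data to be checked against the tighter targets discussed above (1 hr, 30 min) without altering the calculation.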