I have written https://github.com/apache/beam/pull/9910 to reduce the number of FnApiRunnerTest variations. I'm not in a rush to merge it; I'd rather start a discussion first. I'll also try to figure out whether other tests are slowing down the suite significantly.
Best
-P.

On Fri, Oct 25, 2019 at 7:41 PM Valentyn Tymofieiev <valen...@google.com> wrote:

> Thanks, Brian.
> +Udi Meiri <eh...@google.com>
> As a next step, it would be good to know whether the slowdown is caused by
> the tests in this PR or by its effect on other tests, and to confirm that
> only Python 2 codepaths were affected.
>
> On Fri, Oct 25, 2019 at 6:35 PM Brian Hulette <bhule...@google.com> wrote:
>
>> I did a bisect based on the runtime of
>> `./gradlew :sdks:python:test-suites:tox:py2:testPy2Gcp` around the commits
>> between 9/1 and 9/15 to see if I could find the source of the spike that
>> happened around 9/6. It looks like it was due to PR#9283 [1]. I thought
>> maybe this search would reveal some misguided configuration change, but as
>> far as I can tell 9283 just added a well-tested feature. I don't think
>> there's anything to learn from that... I just wanted to circle back about
>> it in case others are curious about that spike.
>>
>> I'm +1 on bumping some FnApiRunner configurations.
>>
>> Brian
>>
>> [1] https://github.com/apache/beam/pull/9283
>>
>> On Fri, Oct 25, 2019 at 4:49 PM Pablo Estrada <pabl...@google.com> wrote:
>>
>>> I think it makes sense to remove some of the extra FnApiRunner
>>> configurations. Perhaps some of the multiworkers and some of the grpc
>>> versions?
>>> Best
>>> -P.
>>>
>>> On Fri, Oct 25, 2019 at 12:27 PM Robert Bradshaw <rober...@google.com> wrote:
>>>
>>>> It looks like fn_api_runner_test.py is quite expensive, taking 10-15+
>>>> minutes on each version of Python. This test consists of a base class
>>>> that is basically a validates runner suite, and is then run in several
>>>> configurations, many more of which (including some expensive ones)
>>>> have been added lately.
>>>>
>>>> class FnApiRunnerTest(unittest.TestCase):
>>>> class FnApiRunnerTestWithGrpc(FnApiRunnerTest):
>>>> class FnApiRunnerTestWithGrpcMultiThreaded(FnApiRunnerTest):
>>>> class FnApiRunnerTestWithDisabledCaching(FnApiRunnerTest):
>>>> class FnApiRunnerTestWithMultiWorkers(FnApiRunnerTest):
>>>> class FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest):
>>>> class FnApiRunnerTestWithBundleRepeat(FnApiRunnerTest):
>>>> class FnApiRunnerTestWithBundleRepeatAndMultiWorkers(FnApiRunnerTest):
>>>>
>>>> I'm not convinced we need to run all of these permutations, or at
>>>> least not all tests in all permutations.
>>>>
>>>> On Fri, Oct 25, 2019 at 10:57 AM Valentyn Tymofieiev <valen...@google.com> wrote:
>>>> >
>>>> > I took another look at this and precommit ITs are already running in
>>>> > parallel, albeit in the same suite. However, it appears Python
>>>> > precommits became slower, especially Python 2 precommits [35 min per
>>>> > suite x 3 suites], see [1]. Not sure yet what caused the increase, but
>>>> > precommits used to be faster. Perhaps we have added a slow test or a
>>>> > lot of new tests.
>>>> >
>>>> > [1] https://scans.gradle.com/s/jvcw5fpqfc64k/timeline?task=ancsbov425524
>>>> >
>>>> > On Thu, Oct 24, 2019 at 4:53 PM Ahmet Altay <al...@google.com> wrote:
>>>> >>
>>>> >> Ack. Separating precommit ITs into a different suite sounds good.
>>>> >> Is anyone interested in doing that?
>>>> >>
>>>> >> On Thu, Oct 24, 2019 at 2:41 PM Valentyn Tymofieiev <valen...@google.com> wrote:
>>>> >>>
>>>> >>> This should not increase the queue time substantially, since
>>>> >>> precommit ITs are running sequentially with precommit tests, unlike
>>>> >>> multiple precommit tests which run in parallel to each other.
>>>> >>>
>>>> >>> The precommit ITs we run are batch and streaming wordcount tests on
>>>> >>> Py2 and one Py3 version, so it's not a lot of tests.
>>>> >>>
>>>> >>> On Thu, Oct 24, 2019 at 1:07 PM Ahmet Altay <al...@google.com> wrote:
>>>> >>>>
>>>> >>>> +1 to separating ITs from precommit. The downside is that when Chad
>>>> >>>> tried to do something similar [1], it was noted that the total time
>>>> >>>> to run all precommit tests would increase, and that the queue time
>>>> >>>> could potentially increase as well.
>>>> >>>>
>>>> >>>> As another alternative, we could run a smaller set of IT tests in
>>>> >>>> precommits and run the whole suite as part of post commit tests.
>>>> >>>>
>>>> >>>> [1] https://github.com/apache/beam/pull/9642
>>>> >>>>
>>>> >>>> On Thu, Oct 24, 2019 at 12:15 PM Valentyn Tymofieiev <valen...@google.com> wrote:
>>>> >>>>>
>>>> >>>>> One improvement could be to move precommit IT tests into a
>>>> >>>>> separate suite from precommit tests, and run it in parallel.
>>>> >>>>>
>>>> >>>>> On Thu, Oct 24, 2019 at 11:41 AM Brian Hulette <bhule...@google.com> wrote:
>>>> >>>>>>
>>>> >>>>>> Python Precommits are taking quite a while now [1]. Just
>>>> >>>>>> visually it looks like the average length is 1.5h or so, but it
>>>> >>>>>> spikes up to 2h. I've had several precommit runs get aborted due
>>>> >>>>>> to the 2 hour limit.
>>>> >>>>>>
>>>> >>>>>> It looks like there was a spike up above 1h back on 9/6 and the
>>>> >>>>>> duration has been steadily rising since then. Is there anything
>>>> >>>>>> we can do about this?
>>>> >>>>>>
>>>> >>>>>> Brian
>>>> >>>>>>
>>>> >>>>>> [1] http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=now-90d&to=now&fullscreen&panelId=4
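
For readers following the suggestion in this thread that not every test needs to run in every FnApiRunnerTest permutation, here is a rough sketch of one way that could look with plain `unittest`. It is not taken from PR 9910 or from fn_api_runner_test.py. The class names, the `ONLY_TESTS` attribute, `make_config`, and `run_and_count` are all hypothetical, made up for illustration. The idea is only that a base suite stays complete while an expensive configuration opts in to a small subset of the inherited tests.

```python
import unittest


class RunnerSuiteBase(unittest.TestCase):
    """Hypothetical stand-in for a base suite that is re-run per configuration."""

    # Hypothetical knob: names of the test methods this configuration runs.
    # None means "run every inherited test".
    ONLY_TESTS = None

    def setUp(self):
        # Skip inherited tests that this configuration did not opt in to.
        method_name = self.id().rsplit('.', 1)[-1]
        if self.ONLY_TESTS is not None and method_name not in self.ONLY_TESTS:
            self.skipTest('not selected for %s' % type(self).__name__)

    def make_config(self):
        # Placeholder for whatever a real configuration would set up.
        return {'grpc': False, 'workers': 1}

    def test_basic(self):
        self.assertGreaterEqual(self.make_config()['workers'], 1)

    def test_grpc_flag_is_bool(self):
        self.assertIsInstance(self.make_config()['grpc'], bool)

    def test_expensive(self):
        self.assertEqual(sum(range(1000)), 499500)


class RunnerSuiteWithGrpcAndMultiWorkers(RunnerSuiteBase):
    """Expensive configuration: only re-run a small smoke subset."""

    ONLY_TESTS = frozenset(['test_basic'])

    def make_config(self):
        return {'grpc': True, 'workers': 2}


def run_and_count(test_class):
    """Run one test class; return (tests started, tests skipped, success)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_class)
    result = unittest.TestResult()
    suite.run(result)
    return result.testsRun, len(result.skipped), result.wasSuccessful()
```

Calling `run_and_count(RunnerSuiteWithGrpcAndMultiWorkers)` reports the inherited tests as skipped rather than silently dropping them, so the reduction stays visible in test reports. Whether skipping or deleting whole configurations is the better trade-off is the open question in this thread.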