Valentyn, the speedup is due to parallelization.

On Mon, Dec 9, 2019 at 6:12 PM Chad Dombrova <chad...@gmail.com> wrote:
>
> On Mon, Dec 9, 2019 at 5:36 PM Udi Meiri <eh...@google.com> wrote:
>
>> I have given this some thought and honestly don't know if splitting into
>> separate jobs will help.
>> - I have seen race conditions with running setuptools in parallel, so
>> more isolation is better.
>>
>
> What race conditions have you seen? I think if we're doing things right,
> this should not be happening, but I don't think we're doing things right.
> One thing that I've noticed is that we're building into the source
> directory, but I also think we're doing weird things like trying to
> copy the source directory beforehand. I really think this system is
> tripping over many non-standard choices that have been made along the way.
> I have never had these sorts of problems in unittests that use tox, even
> when many are running in parallel. I got pulled away from it, but I'm
> really hoping to address these issues here:
> https://github.com/apache/beam/pull/10038.
>

This comment
<https://issues.apache.org/jira/browse/BEAM-8481?focusedCommentId=16988369&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16988369>
summarizes what I believe may be the issue (setuptools races).
I believe copying the source directory was done in an effort to isolate the
parallel builds (setuptools, cythonize).

>
>> What benefits do you see from splitting up the jobs?
>>
>
> The biggest problem is that the jobs are doing too much and take too
> long. This simple fact compounds all of the other problems. It seems
> pretty obvious that we need to do less in each job, as long as the sum of
> all of these smaller jobs is not substantially longer than the one
> monolithic job.
>
> Benefits:
>
> - failures specific to a particular python version will be easier to spot
> in the jenkins error summary, and cheaper to re-queue. right now the
> jenkins report mushes all of the failures together in a way that makes it
> nearly impossible to tell which python version they correspond to. only
> the gradle scan gives you this insight, but it doesn't break the errors by
> test.
>

I agree Jenkins handles duplicate test names pretty badly (reloading will
periodically give you a different result).
With pytest I've been able to set the suite name, so that should help with
identification. (I need to add pytest*.xml collection to the Jenkins job
first.)

> - failures common to all python versions will be reported to the user
> earlier, at which point they can cancel the other jobs if desired. *this
> is by far the biggest benefit.* why wait for 2 hours to see the same
> failure reported for 5 versions of python? if that had run on one version
> of python I could maybe see that error in 30 minutes (while potentially
> other python versions waited in the queue). Repeat for each change pushed.
> - flaky jobs will be cheaper to requeue (since it will affect a
> smaller/shorter job)
> - if xdist is giving us the parallel boost we're hoping for we should get
> under the 2 hour mark every time
>
> Basically we're talking about getting feedback to users faster.
>

+1

>
> I really don't mind pasting a few more phrases if it means faster feedback.
>
> -chad
>
>
>>
>> On Mon, Dec 9, 2019 at 4:17 PM Chad Dombrova <chad...@gmail.com> wrote:
>>
>>> After this PR goes in should we revisit breaking up the python tests
>>> into separate jenkins jobs by python version? One of the problems with
>>> that plan originally was that we lost the parallelism that gradle provides
>>> because we were left with only one tox task per jenkins job, and so the
>>> total time to complete all python jenkins jobs went up a lot. With
>>> pytest + xdist we should hopefully be able to keep the parallelism even
>>> with just one tox task. This could be a big win. I feel like I'm spending
>>> more time monitoring and re-queuing timed-out jenkins jobs lately than I am
>>> writing code.
>>>
>>> On Mon, Dec 9, 2019 at 10:32 AM Udi Meiri <eh...@google.com> wrote:
>>>
>>>> This PR <https://github.com/apache/beam/pull/10322> (in review)
>>>> migrates py27-gcp to using pytest.
>>>> It reduces the testPy2Gcp task down to ~13m
>>>> <https://scans.gradle.com/s/kj7ogemnd3toe/timeline?details=ancsbov425524>
>>>> (from ~45m). This speedup will probably be lower once all 8 tasks are using
>>>> pytest.
>>>> It also adds 5 previously uncollected tests.
>>>>