On Mon, Dec 9, 2019 at 9:33 PM Kenneth Knowles <k...@apache.org> wrote:

>
>
> On Mon, Dec 9, 2019 at 6:34 PM Udi Meiri <eh...@google.com> wrote:
>
>> Valentyn, the speedup is due to parallelization.
>>
>> On Mon, Dec 9, 2019 at 6:12 PM Chad Dombrova <chad...@gmail.com> wrote:
>>
>>>
>>> On Mon, Dec 9, 2019 at 5:36 PM Udi Meiri <eh...@google.com> wrote:
>>>
>>>> I have given this some thought and honestly don't know if splitting into
>>>> separate jobs will help.
>>>> - I have seen race conditions with running setuptools in parallel, so
>>>> more isolation is better.
>>>>
>>>
>>> What race conditions have you seen?  I think if we're doing things
>>> right, this should not be happening, but I don't think we're doing things
>>> right. One thing that I've noticed is that we're building into the source
>>> directory, but I think we're also doing weird things like trying to
>>> copy the source directory beforehand.  I really think this system is
>>> tripping over many non-standard choices that have been made along the way.
>>> I have never seen these sorts of problems in unittests that use tox, even
>>> when many are running in parallel.  I got pulled away from it, but I'm
>>> really hoping to address these issues here:
>>> https://github.com/apache/beam/pull/10038.
>>>
>>
>> This comment
>> <https://issues.apache.org/jira/browse/BEAM-8481?focusedCommentId=16988369&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16988369>
>> summarizes what I believe may be the issue (setuptools races).
>>
>> I believe copying the source directory was done in an effort to isolate
>> the parallel builds (setuptools, cythonize).
>>
>
> Peanut gallery: containerized Jenkins builds seem like they would help,
> and they are the current recommended best practice, but we are not there
> yet. Agree/disagree?
>

I'm okay with containerized Jenkins builds as long as using pytest/tox
directly still works.


>
>>>> What benefits do you see from splitting up the jobs?
>>>>
>>>
>>> The biggest problem is that the jobs are doing too much and take too
>>> long.  This simple fact compounds all of the other problems.  It seems
>>> pretty obvious that we need to do less in each job, as long as the sum of
>>> all of these smaller jobs is not substantially longer than the one
>>> monolithic job.
>>>
>>
> For some reason I keep forgetting the answer to this question: are we
> caching pypi immutable artifacts on every Jenkins worker?
>

I don't know.


>
>>
>>> Benefits:
>>>
>>> - failures specific to a particular python version will be easier to
>>> spot in the jenkins error summary, and cheaper to re-queue.  right now the
>>> jenkins report mushes all of the failures together in a way that makes it
>>> nearly impossible to tell which python version they correspond to.  only
>>> the gradle scan gives you this insight, but it doesn't break the errors by
>>> test.
>>>
>>
>> I agree Jenkins handles duplicate test names pretty badly (reloading will
>> periodically give you a different result).
>>
>
> Saw this in Java too w/ ValidatesRunner suites when they ran in one
> Jenkins job. Worthwhile to avoid.
>
> Kenn
>
>
>> With pytest I've been able to set the suite name so that should help with
>> identification. (I need to add pytest*.xml collection to the Jenkins job
>> first)
>>
>>
>>> - failures common to all python versions will be reported to the user
>>> earlier, at which point they can cancel the other jobs if desired.  *this
>>> is by far the biggest benefit.*  why wait for 2 hours to see the same
>>> failure reported for 5 versions of python?  if that had run on one version
>>> of python I could maybe see that error in 30 minutes (while potentially
>>> other python versions waited in the queue).  Repeat for each change pushed.
>>> - flaky jobs will be cheaper to requeue (since it will affect a
>>> smaller/shorter job)
>>> - if xdist is giving us the parallel boost we're hoping for we should
>>> get under the 2 hour mark every time
>>>
>>> Basically we're talking about getting feedback to users faster.
>>>
>>
>> +1
>>
>>
>>>
>>> I really don't mind pasting a few more phrases if it means faster
>>> feedback.
>>>
>>> -chad
>>>
>>>
>>>
>>>
>>>>
>>>> On Mon, Dec 9, 2019 at 4:17 PM Chad Dombrova <chad...@gmail.com> wrote:
>>>>
>>>>> After this PR goes in should we revisit breaking up the python tests
>>>>> into separate jenkins jobs by python version?  One of the problems with
>>>>> that plan originally was that we lost the parallelism that gradle provides
>>>>> because we were left with only one tox task per jenkins job, and so the
>>>>> total time to complete all python jenkins jobs went up a lot.  With
>>>>> pytest + xdist we should hopefully be able to keep the parallelism even
>>>>> with just one tox task.  This could be a big win.  I feel like I'm
>>>>> spending more time monitoring and re-queuing timed-out jenkins jobs
>>>>> lately than I am writing code.
>>>>>
>>>>> On Mon, Dec 9, 2019 at 10:32 AM Udi Meiri <eh...@google.com> wrote:
>>>>>
>>>>>> This PR <https://github.com/apache/beam/pull/10322> (in review)
>>>>>> migrates py27-gcp to using pytest.
>>>>>> It reduces the testPy2Gcp task down to ~13m
>>>>>> <https://scans.gradle.com/s/kj7ogemnd3toe/timeline?details=ancsbov425524>
>>>>>> (from ~45m). This speedup will probably be lower once all 8 tasks are
>>>>>> using pytest.
>>>>>> It also adds 5 previously uncollected tests.
>>>>>>
>>>>>
