On Tue, Apr 28, 2020 at 5:03 PM Robert Bradshaw <[email protected]> wrote:

> On Tue, Apr 28, 2020 at 4:46 PM Hannah Jiang <[email protected]>
> wrote:
>
>> I guess I assumed there was some reason we needed "lightweight images" in
>>> our tests (because licenses take up a lot of space IIRC), but maybe not.
>>> Can you elaborate on the purpose of this option Hannah?
>>
>> Reducing image size is a good reason to have the pull option. There were
>> user requests to create lightweight images. Now I think users can use skip
>> option to create lightweight iamges. pull option is not needed.
>>
>> Then the discussion becomes which one should be the default mode?
>> According to feedback, skip should be the default mode, and change it to
>> add mode when running Jenkins test. And the docker-pull-licenses tag is
>> binary again.
>> It seems like there is an easy way to pass the tag to all Jenkins test,
>> which is adding *context.switches("-Pdocker-pull-licenses"**)* to
>> CommonJobProperties.groovy
>> <https://github.com/apache/beam/blob/master/.test-infra/jenkins/CommonJobProperties.groovy#L150>.
>> I'm not very familiar with Jenkins, does this would work as expected?
>> If it sounds like a bad idea to pass this tag to ALL tests, I would ask
>> help from community to identify those tests which create docker images
>> (Java and Python for now) and pass the tag to the tests.
>>
>
> That could work.
>
> The other question is, how easy would it be for a release manager to
> accidentally push non-compliant images?
>

Echoing my previous question: would it make sense to add licenses whenever
we build containers on the release branch, which can be checked by [1]?

[1]
https://github.com/apache/beam/blob/74a6565c8b64d9fadf256370df47a4c5dadafb55/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L296




>
> We have definitely not explored how cheap we can make doing the "real
> thing" can be, which would IMHO be best.
>
>
>>
>>
>> On Tue, Apr 28, 2020 at 3:50 PM Kyle Weaver <[email protected]> wrote:
>>
>>> > To overwrite, we need to pass the tag for ALL Jenkins tests that
>>> create docker images, which is quite a bunch of work.
>>>
>>> If it comes to that, I'd rather we do the work of passing tags instead
>>> of users.
>>>
>>> > Does anyone know if there is an easy way to pass the tag to many
>>> tests? or overwrite the default mode, which will be defined at
>>> gradle.properties, to pull ONLY with Jenkins tests?
>>>
>>> There is precedence for the latter option:
>>> https://github.com/apache/beam/blob/1905dbde5ca0858fce89f65c761e88511840a384/build.gradle#L45
>>>
>>> > (On that note, however, pull seems bad for both.)
>>>
>>> I guess I assumed there was some reason we needed "lightweight images"
>>> in our tests (because licenses take up a lot of space IIRC), but maybe not.
>>> Can you elaborate on the purpose of this option Hannah?
>>>
>>> On Tue, Apr 28, 2020 at 6:40 PM Robert Bradshaw <[email protected]>
>>> wrote:
>>>
>>>> Fundamentally having license checking off by default is dangerous for
>>>> releases, but having it on by default is annoying for developers. (On that
>>>> note, however, pull seems bad for both.) Is it possible to make it a gradle
>>>> target that only runs when something (specifically dependencies, or the
>>>> files declaring them) have changed, and would leverage the gradle cache
>>>> (and hence be cheap) otherwise?
>>>>
>>>> Alternatively, I think we should invest in URL caching. These urls
>>>> should be immutable; let's only download them once, ever.
>>>>
>>>> On Tue, Apr 28, 2020 at 3:31 PM Hannah Jiang <[email protected]>
>>>> wrote:
>>>>
>>>>> Do you mean set the default mode to skip and overwrite it to pull mode
>>>>> with Jenkins test? To overwrite, we need to pass the tag for ALL Jenkins
>>>>> tests that create docker images, which is quite a bunch of work.
>>>>> Does anyone know if there is an easy way to pass the tag to many
>>>>> tests? or overwrite the default mode, which will be defined at
>>>>> gradle.properties, to pull ONLY with Jenkins tests? We can add a step to 
>>>>> do
>>>>> this after cloning a git branch to Jenkins machine. But it seems easier to
>>>>> change the local gradle.properties for local development purpose..?
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Apr 28, 2020 at 3:09 PM Thomas Weise <[email protected]> wrote:
>>>>>
>>>>>> Can this be solved by enabling the license magic for the Jenkins jobs?
>>>>>>
>>>>>> I also think the default should be off for better local development
>>>>>> experience.
>>>>>>
>>>>>> On Tue, Apr 28, 2020 at 3:05 PM Hannah Jiang <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> We need to make sure the release images contains licenses/notices in
>>>>>>> order to avoid potential legal issues. The purpose of setting the 
>>>>>>> default
>>>>>>> to pull is checking PRs which introduce new dependencies include thier
>>>>>>> licenses as well, in a way of auto pulling or tool pulling, to make sure
>>>>>>> all licenses are included when we create release images. If setting 
>>>>>>> default
>>>>>>> to skip, we only check licenses when we create release images and may 
>>>>>>> see
>>>>>>> many issues with the licenses, and the release would be delayed, maybe a
>>>>>>> lot.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Apr 28, 2020 at 2:50 PM Kyle Weaver <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> The different modes make sense. One disagreement: I think the
>>>>>>>> default should be skip. I imagine few users would want to put licenses 
>>>>>>>> in
>>>>>>>> their own images.
>>>>>>>>
>>>>>>>> On Tue, Apr 28, 2020 at 5:36 PM Hannah Jiang <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> The above 1,2,3,4 are merged to master.
>>>>>>>>> Here I would like to change behaviors with *docker-pull-licenses*
>>>>>>>>> tag and would like to collect feedbacks from community before 
>>>>>>>>> finalizing my
>>>>>>>>> PR.
>>>>>>>>>
>>>>>>>>> docker-pull-licenses can be set to one of ['add', 'pull', 'skip'].
>>>>>>>>>
>>>>>>>>>    - *docker-pull-licenses=add* pulls licenses/notices and the
>>>>>>>>>    files are added to docker images. This tag is used to create 
>>>>>>>>> release docker
>>>>>>>>>    images.
>>>>>>>>>    - *docker-pull-licenses=pull* pulls licenses/notices but do
>>>>>>>>>    not add the files to docker images so that create lightweight 
>>>>>>>>> images. This
>>>>>>>>>    is the default mode, and docker images created by Jenkins test use 
>>>>>>>>> this
>>>>>>>>>    mode. This will make sure the files are all pullable and detect 
>>>>>>>>> issues with
>>>>>>>>>    the tool, except the ADD part at DockerFile, which is rarely 
>>>>>>>>> likely to be
>>>>>>>>>    an issue. It is better than the checking URLs approach mentioned 
>>>>>>>>> above,
>>>>>>>>>    because it is a more similar process like creating release images 
>>>>>>>>> and code
>>>>>>>>>    can be simplified so that make it easier to maintain.
>>>>>>>>>    - *docker-pull-licenses=skip* skips all license pulling
>>>>>>>>>    related tasks. This tag is provided for users who customize their 
>>>>>>>>> images
>>>>>>>>>    with DockerFile provided by us and would not like to deal with 
>>>>>>>>> license
>>>>>>>>>    stuff. Without this tag, the users should change the .gradle file 
>>>>>>>>> to get
>>>>>>>>>    rid of it. It also can be used for local development when 
>>>>>>>>> developers want
>>>>>>>>>    to solve license related issues later.
>>>>>>>>>
>>>>>>>>> I would like to see this tag is adopted by all docker images
>>>>>>>>> released by Beam, including Flink and Spark job server images, and 
>>>>>>>>> share
>>>>>>>>> the same default mode.
>>>>>>>>> Does this sound good? Are there any concerns?
>>>>>>>>>
>>>>>>>>> Please let me know what you think.
>>>>>>>>> Hannah
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Apr 16, 2020 at 11:46 AM Hannah Jiang <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Ok, so the fianl approach is
>>>>>>>>>> 1. Using multi threading with 16 threads.
>>>>>>>>>> 2. Introduce docker-pull-license tag, by default it is disabled.
>>>>>>>>>> 3. By default, only check if urls return 200, not actually pull
>>>>>>>>>> the files to reduce image size. Licenses add 85MB of image size as 
>>>>>>>>>> of today.
>>>>>>>>>> 4. Add retries to avoid flakiness. Initially it was added to
>>>>>>>>>> download part only, assuming urlib is stable when validates urls, 
>>>>>>>>>> but it is
>>>>>>>>>> not true.
>>>>>>>>>>
>>>>>>>>>> For caching, it is definitely useful. It will mitigate flakiness
>>>>>>>>>> further, like when github is down (many urls point to github). It 
>>>>>>>>>> also
>>>>>>>>>> reduces unnecessary internet traffic.
>>>>>>>>>> Let's reconsider it after the current work is merged to 2.21.0.
>>>>>>>>>>
>>>>>>>>>> Thank you all for providing feedback and ideas.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Apr 16, 2020 at 10:24 AM Kyle Weaver <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> > I tried with multi processing and it improved the performance
>>>>>>>>>>> a lot.
>>>>>>>>>>>
>>>>>>>>>>> Great! Though it won't help with flakes, so as you said we
>>>>>>>>>>> should still look into caching as well.
>>>>>>>>>>>
>>>>>>>>>>> > You could try using a threadpool
>>>>>>>>>>>
>>>>>>>>>>> +1
>>>>>>>>>>>
>>>>>>>>>>> > However, we would like to include this work as part of 2.21.0
>>>>>>>>>>>
>>>>>>>>>>> I had marked the jira as a blocker for 2.21.0 because I was
>>>>>>>>>>> afraid something was broken, but now it looks like the failures 
>>>>>>>>>>> were just
>>>>>>>>>>> flakes. So BEAM-9764
>>>>>>>>>>> <https://issues.apache.org/jira/browse/BEAM-9764> should not be
>>>>>>>>>>> a release blocker.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Apr 16, 2020 at 1:00 PM Thomas Weise <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Hannah,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for investigating!
>>>>>>>>>>>>
>>>>>>>>>>>> I think it would be great to eliminate the overhead for local
>>>>>>>>>>>> builds (by default turn off the license assembly) and make it as
>>>>>>>>>>>> lightweight as possible in the frequent CI path.
>>>>>>>>>>>>
>>>>>>>>>>>> Thomas
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Apr 16, 2020 at 1:37 AM Hannah Jiang <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I tried to check if urls are valid instead of pulling the
>>>>>>>>>>>>> files and it reduced only 1 min of running time. So, it's not an 
>>>>>>>>>>>>> option
>>>>>>>>>>>>> here.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I tried with multi processing and it improved the performance
>>>>>>>>>>>>> a lot.
>>>>>>>>>>>>> With 12 subprocesses, the running time reduced to 49 seconds,
>>>>>>>>>>>>> and with 16 cores, it reduced to 18 seconds.
>>>>>>>>>>>>> The number of subprocesses is defined by the number of cores,
>>>>>>>>>>>>> and Jenkins machine has 16 cores.
>>>>>>>>>>>>> FYI: with my local machine (12 cores) and home network, it
>>>>>>>>>>>>> takes 1min 40 secs to create a Java docker image.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The caching approach mentioned by Robert brings many benefits,
>>>>>>>>>>>>> not only to this use case.
>>>>>>>>>>>>> However, we would like to include this work as part of 2.21.0,
>>>>>>>>>>>>> so I will move with the multi processing approach this time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please let me know if you have objections.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Apr 15, 2020 at 4:01 PM Robert Bradshaw <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is the cost primarily in pulling these remote
>>>>>>>>>>>>>> licenses/sources? I'd guess that 99.9% of the URLs remain the 
>>>>>>>>>>>>>> same from run
>>>>>>>>>>>>>> to run. Would a simple cache, or caching proxy, be sufficient?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Otherwise, a tag to check that licenses can be pulled, but
>>>>>>>>>>>>>> not really pull them, might be sufficient. (Making sure the 
>>>>>>>>>>>>>> default is
>>>>>>>>>>>>>> cheap but we don't accidentally omit them when it matters is the 
>>>>>>>>>>>>>> tricky bit
>>>>>>>>>>>>>> I see here.)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Apr 15, 2020 at 3:38 PM Hannah Jiang <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for providing feedback.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here is what happending now and I would discuss when to run
>>>>>>>>>>>>>>> the job.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *Why it takes 7-8 mins for Java?*
>>>>>>>>>>>>>>> When we list dependencies both in runtime and compile
>>>>>>>>>>>>>>> environment, there are almost 1400 third party dependencies and 
>>>>>>>>>>>>>>> we need to
>>>>>>>>>>>>>>> pull licenses/notices for all of them.
>>>>>>>>>>>>>>> In addition, we need to pull source code if license is CDDL,
>>>>>>>>>>>>>>> MPL, GLP or LGPL. 69 of the dependencies need to pull the 
>>>>>>>>>>>>>>> source code as of
>>>>>>>>>>>>>>> 4/14/2020.
>>>>>>>>>>>>>>> Getting dependency list + pulling licenses/notices/source
>>>>>>>>>>>>>>> code takes 7-8 minutes.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Now I see there are *two patterns of failures*.
>>>>>>>>>>>>>>> 1. In valid URLs. In fact, the urls are not invalid, but
>>>>>>>>>>>>>>> occationally, it returns URLError. This can be resolved by 
>>>>>>>>>>>>>>> adding retries.
>>>>>>>>>>>>>>> However, it will add runtime to the job.
>>>>>>>>>>>>>>> 2. No artifacts available. Sometimes, when a new version of
>>>>>>>>>>>>>>> package is released  and the plugin still looks for staging 
>>>>>>>>>>>>>>> location. For
>>>>>>>>>>>>>>> example, new zetasql packages were released on 4/14, and today 
>>>>>>>>>>>>>>> I saw
>>>>>>>>>>>>>>> several failures with looking for staging repo. The behavior is 
>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>> consistent, sometimes it scans correct location, sometimes not. 
>>>>>>>>>>>>>>> This can be
>>>>>>>>>>>>>>> resolved by running the job again.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *When the job is running?*
>>>>>>>>>>>>>>> generateThirdPartyLicenses is added to :sdks:java:container
>>>>>>>>>>>>>>> and it is an upstream of the docker task. As such, whenever a 
>>>>>>>>>>>>>>> docker is
>>>>>>>>>>>>>>> created, the job is triggered.
>>>>>>>>>>>>>>> :sdks:java:container:docker is added to Java PreSubmit job.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *How to improve it?*
>>>>>>>>>>>>>>> According to some ideas provided above, how about doing this?
>>>>>>>>>>>>>>> Introduce a tag (ie: pull-licenses) to docker job to decide
>>>>>>>>>>>>>>> if pull the files. Default tag is NOT setting pull-licenses.
>>>>>>>>>>>>>>> When pull-licenses is not set, it checks if the
>>>>>>>>>>>>>>> licenses/notices/source code can be pull automaticall or they 
>>>>>>>>>>>>>>> have urls to
>>>>>>>>>>>>>>> pull from, but don't really pull.
>>>>>>>>>>>>>>> When pull-license is set, files are pulled.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For each PR (Presubmit): applying default option. The test
>>>>>>>>>>>>>>> would fail if the files cannot be pulled, so committers still 
>>>>>>>>>>>>>>> need to fix
>>>>>>>>>>>>>>> dependency errors. I believe it would reduce the running time.
>>>>>>>>>>>>>>> For release: set the tag and pull the files and source code.
>>>>>>>>>>>>>>> Since it is checked for each PR, pulling should finish without 
>>>>>>>>>>>>>>> problems.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please let me know what you think and if there are other
>>>>>>>>>>>>>>> things can be improved.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hannah
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Apr 15, 2020 at 2:30 PM Kyle Weaver <
>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Looks like the same error as this Jira:
>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/BEAM-9764
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Even if/when we are able to fix this particular issue, I
>>>>>>>>>>>>>>>> agree it is best not to run this job except for releases 
>>>>>>>>>>>>>>>> because of the
>>>>>>>>>>>>>>>> inherent network cost and possible reliability issues. +Hannah
>>>>>>>>>>>>>>>> Jiang <[email protected]> What do you think?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Apr 15, 2020 at 5:20 PM Thomas Weise <
>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The new feature to assemble licenses is very useful but
>>>>>>>>>>>>>>>>> appears to add several minutes (7-8?)  build time to jobs 
>>>>>>>>>>>>>>>>> that need to
>>>>>>>>>>>>>>>>> build a container.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Does it also seem to cause occasional build failures?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Phrase/131/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Would it be possible to perform this task only during
>>>>>>>>>>>>>>>>> release builds?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Thomas
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>

Reply via email to