How about we pull/add dependencies when either: - The container build target is executed for a release branch [1]. - We are running the build target on Jenkins: - some environmental variable tells us that we are running on Jenkins - path to repo starts with /home/jenkins - perhaps there is a better way to check this.
Would this mitigate concerns for local development mentioned here? [1] https://github.com/apache/beam/blob/74a6565c8b64d9fadf256370df47a4c5dadafb55/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L296 On Tue, Apr 28, 2020 at 3:50 PM Kyle Weaver <[email protected]> wrote: > > To overwrite, we need to pass the tag for ALL Jenkins tests that create > docker images, which is quite a bunch of work. > > If it comes to that, I'd rather we do the work of passing tags instead of > users. > > > Does anyone know if there is an easy way to pass the tag to many tests? > or overwrite the default mode, which will be defined at gradle.properties, > to pull ONLY with Jenkins tests? > > There is precedence for the latter option: > https://github.com/apache/beam/blob/1905dbde5ca0858fce89f65c761e88511840a384/build.gradle#L45 > > > (On that note, however, pull seems bad for both.) > > I guess I assumed there was some reason we needed "lightweight images" in > our tests (because licenses take up a lot of space IIRC), but maybe not. > Can you elaborate on the purpose of this option Hannah? > > On Tue, Apr 28, 2020 at 6:40 PM Robert Bradshaw <[email protected]> > wrote: > >> Fundamentally having license checking off by default is dangerous for >> releases, but having it on by default is annoying for developers. (On that >> note, however, pull seems bad for both.) Is it possible to make it a gradle >> target that only runs when something (specifically dependencies, or the >> files declaring them) have changed, and would leverage the gradle cache >> (and hence be cheap) otherwise? >> >> Alternatively, I think we should invest in URL caching. These urls should >> be immutable; let's only download them once, ever. >> >> On Tue, Apr 28, 2020 at 3:31 PM Hannah Jiang <[email protected]> >> wrote: >> >>> Do you mean set the default mode to skip and overwrite it to pull mode >>> with Jenkins test? To overwrite, we need to pass the tag for ALL Jenkins >>> tests that create docker images, which is quite a bunch of work. >>> Does anyone know if there is an easy way to pass the tag to many tests? >>> or overwrite the default mode, which will be defined at gradle.properties, >>> to pull ONLY with Jenkins tests? We can add a step to do this after cloning >>> a git branch to Jenkins machine. But it seems easier to change the local >>> gradle.properties for local development purpose..? >>> >>> >>> >>> On Tue, Apr 28, 2020 at 3:09 PM Thomas Weise <[email protected]> wrote: >>> >>>> Can this be solved by enabling the license magic for the Jenkins jobs? >>>> >>>> I also think the default should be off for better local development >>>> experience. >>>> >>>> On Tue, Apr 28, 2020 at 3:05 PM Hannah Jiang <[email protected]> >>>> wrote: >>>> >>>>> We need to make sure the release images contains licenses/notices in >>>>> order to avoid potential legal issues. The purpose of setting the default >>>>> to pull is checking PRs which introduce new dependencies include thier >>>>> licenses as well, in a way of auto pulling or tool pulling, to make sure >>>>> all licenses are included when we create release images. If setting >>>>> default >>>>> to skip, we only check licenses when we create release images and may see >>>>> many issues with the licenses, and the release would be delayed, maybe a >>>>> lot. >>>>> >>>>> >>>>> On Tue, Apr 28, 2020 at 2:50 PM Kyle Weaver <[email protected]> >>>>> wrote: >>>>> >>>>>> The different modes make sense. One disagreement: I think the default >>>>>> should be skip. I imagine few users would want to put licenses in their >>>>>> own >>>>>> images. >>>>>> >>>>>> On Tue, Apr 28, 2020 at 5:36 PM Hannah Jiang <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> The above 1,2,3,4 are merged to master. >>>>>>> Here I would like to change behaviors with *docker-pull-licenses* >>>>>>> tag and would like to collect feedbacks from community before >>>>>>> finalizing my >>>>>>> PR. >>>>>>> >>>>>>> docker-pull-licenses can be set to one of ['add', 'pull', 'skip']. >>>>>>> >>>>>>> - *docker-pull-licenses=add* pulls licenses/notices and the >>>>>>> files are added to docker images. This tag is used to create release >>>>>>> docker >>>>>>> images. >>>>>>> - *docker-pull-licenses=pull* pulls licenses/notices but do not >>>>>>> add the files to docker images so that create lightweight images. >>>>>>> This is >>>>>>> the default mode, and docker images created by Jenkins test use this >>>>>>> mode. >>>>>>> This will make sure the files are all pullable and detect issues >>>>>>> with the >>>>>>> tool, except the ADD part at DockerFile, which is rarely likely to >>>>>>> be an >>>>>>> issue. It is better than the checking URLs approach mentioned above, >>>>>>> because it is a more similar process like creating release images >>>>>>> and code >>>>>>> can be simplified so that make it easier to maintain. >>>>>>> - *docker-pull-licenses=skip* skips all license pulling related >>>>>>> tasks. This tag is provided for users who customize their images with >>>>>>> DockerFile provided by us and would not like to deal with license >>>>>>> stuff. >>>>>>> Without this tag, the users should change the .gradle file to get >>>>>>> rid of >>>>>>> it. It also can be used for local development when developers want >>>>>>> to solve >>>>>>> license related issues later. >>>>>>> >>>>>>> I would like to see this tag is adopted by all docker images >>>>>>> released by Beam, including Flink and Spark job server images, and share >>>>>>> the same default mode. >>>>>>> Does this sound good? Are there any concerns? >>>>>>> >>>>>>> Please let me know what you think. >>>>>>> Hannah >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Thu, Apr 16, 2020 at 11:46 AM Hannah Jiang < >>>>>>> [email protected]> wrote: >>>>>>> >>>>>>>> Ok, so the fianl approach is >>>>>>>> 1. Using multi threading with 16 threads. >>>>>>>> 2. Introduce docker-pull-license tag, by default it is disabled. >>>>>>>> 3. By default, only check if urls return 200, not actually pull the >>>>>>>> files to reduce image size. Licenses add 85MB of image size as of >>>>>>>> today. >>>>>>>> 4. Add retries to avoid flakiness. Initially it was added to >>>>>>>> download part only, assuming urlib is stable when validates urls, but >>>>>>>> it is >>>>>>>> not true. >>>>>>>> >>>>>>>> For caching, it is definitely useful. It will mitigate flakiness >>>>>>>> further, like when github is down (many urls point to github). It also >>>>>>>> reduces unnecessary internet traffic. >>>>>>>> Let's reconsider it after the current work is merged to 2.21.0. >>>>>>>> >>>>>>>> Thank you all for providing feedback and ideas. >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Apr 16, 2020 at 10:24 AM Kyle Weaver <[email protected]> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> > I tried with multi processing and it improved the performance a >>>>>>>>> lot. >>>>>>>>> >>>>>>>>> Great! Though it won't help with flakes, so as you said we should >>>>>>>>> still look into caching as well. >>>>>>>>> >>>>>>>>> > You could try using a threadpool >>>>>>>>> >>>>>>>>> +1 >>>>>>>>> >>>>>>>>> > However, we would like to include this work as part of 2.21.0 >>>>>>>>> >>>>>>>>> I had marked the jira as a blocker for 2.21.0 because I was afraid >>>>>>>>> something was broken, but now it looks like the failures were just >>>>>>>>> flakes. >>>>>>>>> So BEAM-9764 <https://issues.apache.org/jira/browse/BEAM-9764> should >>>>>>>>> not be a release blocker. >>>>>>>>> >>>>>>>>> On Thu, Apr 16, 2020 at 1:00 PM Thomas Weise <[email protected]> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hi Hannah, >>>>>>>>>> >>>>>>>>>> Thanks for investigating! >>>>>>>>>> >>>>>>>>>> I think it would be great to eliminate the overhead for local >>>>>>>>>> builds (by default turn off the license assembly) and make it as >>>>>>>>>> lightweight as possible in the frequent CI path. >>>>>>>>>> >>>>>>>>>> Thomas >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thu, Apr 16, 2020 at 1:37 AM Hannah Jiang < >>>>>>>>>> [email protected]> wrote: >>>>>>>>>> >>>>>>>>>>> I tried to check if urls are valid instead of pulling the files >>>>>>>>>>> and it reduced only 1 min of running time. So, it's not an option >>>>>>>>>>> here. >>>>>>>>>>> >>>>>>>>>>> I tried with multi processing and it improved the performance a >>>>>>>>>>> lot. >>>>>>>>>>> With 12 subprocesses, the running time reduced to 49 seconds, >>>>>>>>>>> and with 16 cores, it reduced to 18 seconds. >>>>>>>>>>> The number of subprocesses is defined by the number of cores, >>>>>>>>>>> and Jenkins machine has 16 cores. >>>>>>>>>>> FYI: with my local machine (12 cores) and home network, it takes >>>>>>>>>>> 1min 40 secs to create a Java docker image. >>>>>>>>>>> >>>>>>>>>>> The caching approach mentioned by Robert brings many benefits, >>>>>>>>>>> not only to this use case. >>>>>>>>>>> However, we would like to include this work as part of 2.21.0, >>>>>>>>>>> so I will move with the multi processing approach this time. >>>>>>>>>>> >>>>>>>>>>> Please let me know if you have objections. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Wed, Apr 15, 2020 at 4:01 PM Robert Bradshaw < >>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>> >>>>>>>>>>>> Is the cost primarily in pulling these remote licenses/sources? >>>>>>>>>>>> I'd guess that 99.9% of the URLs remain the same from run to run. >>>>>>>>>>>> Would a >>>>>>>>>>>> simple cache, or caching proxy, be sufficient? >>>>>>>>>>>> >>>>>>>>>>>> Otherwise, a tag to check that licenses can be pulled, but not >>>>>>>>>>>> really pull them, might be sufficient. (Making sure the default is >>>>>>>>>>>> cheap >>>>>>>>>>>> but we don't accidentally omit them when it matters is the tricky >>>>>>>>>>>> bit I see >>>>>>>>>>>> here.) >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Wed, Apr 15, 2020 at 3:38 PM Hannah Jiang < >>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Thanks for providing feedback. >>>>>>>>>>>>> >>>>>>>>>>>>> Here is what happending now and I would discuss when to run >>>>>>>>>>>>> the job. >>>>>>>>>>>>> >>>>>>>>>>>>> *Why it takes 7-8 mins for Java?* >>>>>>>>>>>>> When we list dependencies both in runtime and compile >>>>>>>>>>>>> environment, there are almost 1400 third party dependencies and >>>>>>>>>>>>> we need to >>>>>>>>>>>>> pull licenses/notices for all of them. >>>>>>>>>>>>> In addition, we need to pull source code if license is CDDL, >>>>>>>>>>>>> MPL, GLP or LGPL. 69 of the dependencies need to pull the source >>>>>>>>>>>>> code as of >>>>>>>>>>>>> 4/14/2020. >>>>>>>>>>>>> Getting dependency list + pulling licenses/notices/source code >>>>>>>>>>>>> takes 7-8 minutes. >>>>>>>>>>>>> >>>>>>>>>>>>> Now I see there are *two patterns of failures*. >>>>>>>>>>>>> 1. In valid URLs. In fact, the urls are not invalid, but >>>>>>>>>>>>> occationally, it returns URLError. This can be resolved by adding >>>>>>>>>>>>> retries. >>>>>>>>>>>>> However, it will add runtime to the job. >>>>>>>>>>>>> 2. No artifacts available. Sometimes, when a new version of >>>>>>>>>>>>> package is released and the plugin still looks for staging >>>>>>>>>>>>> location. For >>>>>>>>>>>>> example, new zetasql packages were released on 4/14, and today I >>>>>>>>>>>>> saw >>>>>>>>>>>>> several failures with looking for staging repo. The behavior is >>>>>>>>>>>>> not >>>>>>>>>>>>> consistent, sometimes it scans correct location, sometimes not. >>>>>>>>>>>>> This can be >>>>>>>>>>>>> resolved by running the job again. >>>>>>>>>>>>> >>>>>>>>>>>>> *When the job is running?* >>>>>>>>>>>>> generateThirdPartyLicenses is added to :sdks:java:container >>>>>>>>>>>>> and it is an upstream of the docker task. As such, whenever a >>>>>>>>>>>>> docker is >>>>>>>>>>>>> created, the job is triggered. >>>>>>>>>>>>> :sdks:java:container:docker is added to Java PreSubmit job. >>>>>>>>>>>>> >>>>>>>>>>>>> *How to improve it?* >>>>>>>>>>>>> According to some ideas provided above, how about doing this? >>>>>>>>>>>>> Introduce a tag (ie: pull-licenses) to docker job to decide if >>>>>>>>>>>>> pull the files. Default tag is NOT setting pull-licenses. >>>>>>>>>>>>> When pull-licenses is not set, it checks if the >>>>>>>>>>>>> licenses/notices/source code can be pull automaticall or they >>>>>>>>>>>>> have urls to >>>>>>>>>>>>> pull from, but don't really pull. >>>>>>>>>>>>> When pull-license is set, files are pulled. >>>>>>>>>>>>> >>>>>>>>>>>>> For each PR (Presubmit): applying default option. The test >>>>>>>>>>>>> would fail if the files cannot be pulled, so committers still >>>>>>>>>>>>> need to fix >>>>>>>>>>>>> dependency errors. I believe it would reduce the running time. >>>>>>>>>>>>> For release: set the tag and pull the files and source code. >>>>>>>>>>>>> Since it is checked for each PR, pulling should finish without >>>>>>>>>>>>> problems. >>>>>>>>>>>>> >>>>>>>>>>>>> Please let me know what you think and if there are other >>>>>>>>>>>>> things can be improved. >>>>>>>>>>>>> >>>>>>>>>>>>> Hannah >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Wed, Apr 15, 2020 at 2:30 PM Kyle Weaver < >>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Looks like the same error as this Jira: >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/BEAM-9764 >>>>>>>>>>>>>> >>>>>>>>>>>>>> Even if/when we are able to fix this particular issue, I >>>>>>>>>>>>>> agree it is best not to run this job except for releases because >>>>>>>>>>>>>> of the >>>>>>>>>>>>>> inherent network cost and possible reliability issues. +Hannah >>>>>>>>>>>>>> Jiang <[email protected]> What do you think? >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, Apr 15, 2020 at 5:20 PM Thomas Weise <[email protected]> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> The new feature to assemble licenses is very useful but >>>>>>>>>>>>>>> appears to add several minutes (7-8?) build time to jobs that >>>>>>>>>>>>>>> need to >>>>>>>>>>>>>>> build a container. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Does it also seem to cause occasional build failures? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Phrase/131/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Would it be possible to perform this task only during >>>>>>>>>>>>>>> release builds? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Thomas >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>
