[jira] [Commented] (FLINK-33373) Capture build scans on ge.apache.org to benefit from deep build insights

Clay Johnson (Jira) Fri, 27 Oct 2023 06:26:08 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-33373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17780369#comment-17780369
 ]


Clay Johnson commented on FLINK-33373:
--------------------------------------

Good questions [~mapohl]!
{quote}I'm not that familiar with Develocity

It looks like it supports Maven projects (like Flink) along Gradle projects. Is 
this correct?
{quote}
If it helps, Develocity was originally called Gradle Enterprise, but was 
recently renamed Develocity since it supports Maven projects as well as Gradle 
projects. Support for Bazel and sbt is also being implemented.
{quote}it connects with the CI build (in our case Azure CI)
{quote}
Correct, although it can also connect to local builds if an ASF 
committer opts in by creating an [access 
key|https://docs.gradle.com/enterprise/maven-extension/#automated_access_key_provisioning]
 on their own machine.
{quote}scans the build artifacts
{quote}
I wouldn't quite say that it scans the build "artifacts". It does not do deep 
analysis of the jar files produced by the build, for example. Essentially, it 
monitors the build for events and uploads them as a build scan when the build 
completes, so that you can open the build scan and understand everything that 
happened during the build.
{quote}Does it provide other features along the ones I mentioned above?
{quote}
Yes, there is much data in the build scan. In addition to what you mentioned, 
there is detailed information on dependency downloads and resolution, which can 
be helpful in investigating dependency resolution issues. There is also basic 
metadata, such as the infrastructure of the build machine, which becomes much 
more useful when comparing two builds against each other to see how they differ.

In addition, by aggregating all of this data across many builds, you can start 
to understand your build performance over time as well as the behavior of tests 
and the common reasons for build failures. This data aggregation can help 
determine how best to address any pain points in your build. One feature that 
has resonated with many ASF teams that we have talked to is the detection and 
reporting of flaky tests.
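
To make the flaky-test idea concrete, here is a minimal sketch of one common 
heuristic: a test counts as flaky if it both passed and failed on the same 
commit. This is only an illustration of what aggregating results across builds 
makes possible, not a description of how Develocity detects flaky tests. The 
`TestRun` record, the `find_flaky_tests` function, and the sample test names 
and commit ids are all hypothetical.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple, Set


class TestRun(NamedTuple):
    """One test execution taken from one build (hypothetical record shape)."""
    commit: str
    test_name: str
    passed: bool


def find_flaky_tests(runs: Iterable[TestRun]) -> Set[str]:
    """Return the names of tests that both passed and failed on one commit."""
    outcomes = defaultdict(set)
    for run in runs:
        # Collect the distinct outcomes seen for each (commit, test) pair.
        outcomes[(run.commit, run.test_name)].add(run.passed)
    # Two distinct outcomes on the same commit means the code did not change
    # but the result did, which is the heuristic's definition of flaky.
    return {name for (_, name), seen in outcomes.items() if len(seen) == 2}


# Hypothetical sample data: results aggregated from several builds.
runs = [
    TestRun("abc123", "CheckpointITCase", True),
    TestRun("abc123", "CheckpointITCase", False),
    TestRun("abc123", "WindowOperatorTest", True),
    TestRun("def456", "WindowOperatorTest", False),
]
flaky = find_flaky_tests(runs)
```

In this sample, `WindowOperatorTest` is not reported, because its pass and its 
failure happened on different commits and could reflect a real regression.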

There are also performance features, such as caching, test distribution, and 
predictive test selection that can be future areas to explore in order to make 
builds faster.
{quote}Is this something that is advised to run on {{master}} and the release 
branches only? Or would this also be something that could be enabled for PRs?
{quote}
We recommend enabling this for all builds, including PR builds. Accumulating as 
much data as possible gives the best picture of build times, test failures, 
etc. One caveat with PRs, though, is that (at least on GitHub with Actions) 
PRs may not have access to secrets when run from forks. So PR builds from 
contributor forks may not have permission to upload the scan at the end of 
the build. We are working on ways to address this.
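
As a rough illustration of the fork caveat, a CI helper could check whether the 
access key is present before attempting to publish. This is a sketch under 
stated assumptions: `should_publish_scan` is a hypothetical helper, not part of 
Develocity or of the PR, and it assumes the key is exposed to the build as the 
`GE_ACCESS_KEY` environment variable.

```python
import os
from typing import Mapping, Optional


def should_publish_scan(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only if a non-empty access key is visible to this build.

    Hypothetical helper: assumes the key arrives as GE_ACCESS_KEY. Builds
    without access to CI secrets (such as fork PRs) would see it unset or empty.
    """
    env = os.environ if env is None else env
    return bool(env.get("GE_ACCESS_KEY", "").strip())


# A build without the secret simply skips publishing instead of failing.
if should_publish_scan():
    print("Access key found: the build scan can be published.")
else:
    print("No access key: skipping build scan publication.")
```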
{quote}Can we set up a test run where we could evaluate how it works with Flink 
before merging it to {{master}}?
{quote}
Yes, you could. We would need to get ASF Infra to create a CI user specifically 
for Flink. Then whoever has the necessary access on Azure would need to set the 
key as a secret there. The PR already expects a `GE_ACCESS_KEY` to exist. From 
there, a build should publish scans. I verified this on my Azure account. We'd 
be happy to help out with that test.
{quote}We're currently looking into migrating from Azure CI to Github Actions 
(FLINK-27075). But I assume that wouldn't be such a problem for the 
ge.apache.org integration, would it? Would we be able to preserve the history 
already gathered by the Azure CI runs when moving to Github Actions?
{quote}
Develocity is agnostic when it comes to the CI tool used, so either Azure CI or 
GitHub Actions will work just fine. Currently, ge.apache.org is retaining all 
data, so you would keep access to the Azure data. Migration is one case where 
Develocity can help, especially through the build scan comparison feature. It 
can be helpful to compare builds that ran on one CI vs the other to see what 
may be different about them.

> Capture build scans on ge.apache.org to benefit from deep build insights
> ------------------------------------------------------------------------
>
>                 Key: FLINK-33373
>                 URL: https://issues.apache.org/jira/browse/FLINK-33373
>             Project: Flink
>          Issue Type: Improvement
>          Components: Build System / CI
>            Reporter: Clay Johnson
>            Assignee: Clay Johnson
>            Priority: Minor
>              Labels: pull-request-available
>
> This improvement will enhance the functionality of the Flink build by 
> publishing build scans to [ge.apache.org|https://ge.apache.org/], hosted by 
> the Apache Software Foundation and run in partnership between the ASF and 
> Gradle. This Develocity instance has all features and extensions enabled and 
> is freely available for use by the Apache Flink project and all other Apache 
> projects.
> On this Develocity instance, Apache Flink will have access not only to all of 
> the published build scans but also to aggregate data features such as:
>  * Dashboards to view all historical build scans, along with performance 
> trends over time
>  * Build failure analytics for enhanced investigation and diagnosis of build 
> failures
>  * Test failure analytics to better understand trends and causes around slow, 
> failing, and flaky tests



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
