Hi Tiago,
I fully agree with that procedure (and I think skipping the issue for small
fixes, like the PR I just opened, is also the way to go).

On Thu, Aug 1, 2024 at 7:13 PM Tiago Bento <[email protected]> wrote:

> Paolo, I understand your proposal, I'm just concerned that the build
> sheriff role keeps rotating around the same people, as not everyone is
> available to volunteer and/or act as such. I know it's not your
> intention, but it's something that can happen. We can't expect all
> contributors to express interest in being the build sheriff for a
> month, nor that this will be something we can maintain running
> sustainably.
>
> Francisco, filing a new issue and fixing the problem on a separate PR
> is indeed the way to go, IMHO. What we usually do on `kie-tools` is:
> 1. Send a PR with a change.
> 2. Observe red PR checks, unrelated to the changes introduced.
> 3. Open an issue and send a separate PR targeting the same branch of
> the original PR, fixing the problem on the PR checks.
> 4. Review and merge this second PR, closing the new issue.
> 5. Retrigger PR checks on the original PR.
> 6. Observe a green build, review and merge it normally.
>
> * Sometimes we skip opening an issue, if the effort to fix it is small
> enough, and we can use the PR description to provide enough context
> for reviewers and watchers of the repo.
>
> The important thing, IMHO, is that the original PR doesn't get merged
> before the unrelated issue on PR checks is fixed. Otherwise we open a
> credit line that allows us to fall into tech debt :) My view has
> always been that we need to collectively cherish our CI, PR checks and
> automations, seeing those as the canonical way to build our software.
> But it's been really hard to cherish something so distant from our
> day-to-day work, especially when we all can, to some extent, continue
> operating the same way we've been for the last who knows how many
> years, somewhat ignoring the systems we currently have :/
>
> On Thu, Aug 1, 2024 at 11:54 AM Francisco Javier Tirado Sarti
> <[email protected]> wrote:
> >
> > I forgot to mention another topic that is difficult to fix but easy to
> > discuss: whether the Jenkins machines executing the tests are properly
> > dimensioned for the tests we are executing.
> > For example, in the previous PR, the timeout to start up the Keycloak
> > Quarkus instance in the IT test was increased to 2 minutes, because the
> > default of 1 minute does not seem to be enough for downloading and running
> > the Keycloak image.
> > The CI is already taking ages.
> > Either we increase our HW resources for testing, or we start reducing our
> > test scope.
> >
> > On Thu, Aug 1, 2024 at 5:42 PM Francisco Javier Tirado Sarti <
> > [email protected]> wrote:
> >
> > > By the way, I opened
> > > https://github.com/apache/incubator-kie-kogito-examples/pull/1991 to fix
> > > https://ci-builds.apache.org/job/KIE/job/kogito/job/10.0.x/job/nightly/job/kogito-examples.build-and-deploy/17/
> > > So it can be said that I acted as sheriff, but since I'm weak and cannot
> > > handle the pressure, I pass the torch (or the star) to the next one ;)
> > >
> > >
> > > On Thu, Aug 1, 2024 at 5:37 PM Francisco Javier Tirado Sarti <
> > > [email protected]> wrote:
> > >
> > >> Hi Tiago,
> > >> About point 2, when the issue blocking the merge is really unrelated,
> > >> wouldn't it be a better approach to open a separate issue to fix the
> > >> unrelated problem?
> > >> I think we agree that it is better for tracking (so you do not see an
> > >> unrelated change in a PR history) and that it avoids the undesired
> > >> situation of two developers trying to fix the same unrelated issue from
> > >> two simultaneous PRs (one of the two eventually has to rebase and realize
> > >> the broken test is already fixed, but still, there are fewer chances of
> > >> them working on the same problem if there is an entry in the issue list).
> > >>
> > >>
> > >> On Thu, Aug 1, 2024 at 4:58 PM Tiago Bento <[email protected]> wrote:
> > >>
> > >>> Thanks Paolo for starting this conversation. Let me bring a little bit
> > >>> of my perspective to it.
> > >>>
> > >>> Although I agree that having people "dedicated" to the quality and
> > >>> stability of our CI and other automations would be better than what we
> > >>> have today, having our builds break so often that we need a system in
> > >>> place to deal with them is a symptom of other problems, IMHO.
> > >>>
> > >>> The complexity of our CI systems and automations discourages most
> > >>> people from getting involved. Without the system itself changing and
> > >>> being more approachable, having "build sheriffs" will only make the
> > >>> separation between "development" and "CI" bigger, and we'll be reliant
> > >>> on a small group of people who'll become solely responsible for either
> > >>> fixing stuff other people broke, or chasing them to fix it. When
> > >>> inevitably these experts can't or simply don't want to contribute to
> > >>> this area of the community anymore, we're in big trouble.
> > >>>
> > >>> My opinion is that we could try and concentrate our efforts to reduce
> > >>> the barrier of entry to maintaining the CI and automations we have,
> > >>> while putting a system in place that will naturally have each one of
> > >>> us know at least the basics of how the CI and automations work.
> > >>>
> > >>> From my experience maintaining `kie-tools`, a few things help reach
> > >>> that point:
> > >>> 1. Having local builds be as similar as possible to CI builds. No
> > >>> fancy commands or profiles that only run on CI.
> > >>> 2. Red PRs can't be merged. Ever. If your PR became red for "unrelated
> > >>> reasons", you then become responsible for fixing the "unrelated issue",
> > >>> helping everyone else not face the same problem.
> > >>> 3. Having a CI system with the least amount of abstractions possible.
> > >>> Less CI code == less cognitive load == smaller barrier of entry.
> > >>>
> > >>> Moving away from Jenkins for PR checks and concentrating on GitHub
> > >>> Actions is, IMHO, already a great step in that direction.
> > >>>
> > >>> I hope I could bring something positive to the discussion.
> > >>>
> > >>> Thanks!
> > >>>
> > >>> Regards,
> > >>>
> > >>> Tiago Bento
> > >>>
> > >>> On Thu, Aug 1, 2024 at 10:08 AM Gabriele Cardosi
> > >>> <[email protected]> wrote:
> > >>> >
> > >>> > Thanks for the clarification, Paolo!
> > >>> >
> > >>> > On Thu, Aug 1, 2024 at 3:46 PM Paolo Bizzarri <[email protected]>
> > >>> > wrote:
> > >>> >
> > >>> > > Hi Gabriele,
> > >>> > >
> > >>> > > it is a mix of various stuff.
> > >>> > >
> > >>> > > For example, take the various issues that I reported in the analysis
> > >>> > > done for the 10.x branch. Most of them apply just the same to the
> > >>> > > main branch.
> > >>> > >
> > >>> > > For example
> > >>> > >
> > >>> > >
> > >>> > > https://ci-builds.apache.org/job/KIE/job/kogito/job/main/job/tools/job/kogito-clean-old-nightly-images/
> > >>> > >
> > >>> > > Now this is probably a build that just has to be deleted, but still
> > >>> > > it is always red, and we need someone who looks at it and decides
> > >>> > > that yes, we need to get rid of it, creates a corresponding kie issue
> > >>> > > and goes after it.
> > >>> > >
> > >>> > > Another example:
> > >>> > >
> > >>> > >
> > >>> > > https://ci-builds.apache.org/job/KIE/job/kogito/job/10.0.x/job/nightly/job/kogito-examples.build-and-deploy/17/
> > >>> > >
> > >>> > > This test has been failing almost every day for the last few days.
> > >>> > > Either we need to make it a little more stable, or get rid of it.
> > >>> > >
> > >>> > > And so on.
> > >>> > >
> > >>> > > The goal of the sheriff is to keep the top level folder in good
> > >>> > > health, and that means that all the underlying jobs are healthy.
> > >>> > >
> > >>> > > I hope this clarifies my proposal.
> > >>> > >
> > >>> > > Regards
> > >>> > >
> > >>> > > Paolo
> > >>> > >
> > >>> > >
> > >>> > >
> > >>> > > On Thu, Aug 1, 2024 at 3:18 PM Gabriele Cardosi <[email protected]>
> > >>> > > wrote:
> > >>> > >
> > >>> > > > Hi Paolo,
> > >>> > > > could you explain exactly what you mean by "builds are often
> > >>> > > > broken"? Could you give an example and, in that example, say what
> > >>> > > > the "sheriff" should do to manage it? (Sorry, I just need to
> > >>> > > > understand what you are referring to.)
> > >>> > > >
> > >>> > > > Thanks!
> > >>> > > >
> > >>> > > > On Thu, Aug 1, 2024 at 3:09 PM Paolo Bizzarri <[email protected]>
> > >>> > > > wrote:
> > >>> > > >
> > >>> > > > > Hello kie mates,
> > >>> > > > >
> > >>> > > > > please find my proposal in the following.
> > >>> > > > >
> > >>> > > > > PROBLEM
> > >>> > > > > - builds are often broken and they stay broken for a long time.
> > >>> > > > > There seems to be no clear definition of who should take care of
> > >>> > > > > this.
> > >>> > > > >
> > >>> > > > > CONTEXT
> > >>> > > > > - fixing builds is slow, annoying and typically is more a job of
> > >>> > > > > chasing someone else than fixing it yourself. So it quickly
> > >>> > > > > becomes wearing.
> > >>> > > > >
> > >>> > > > > PROPOSED SOLUTION
> > >>> > > > > - identify a number of build sheriffs who look at the various
> > >>> > > > > builds, open the relevant issues for tracking and chase other devs
> > >>> > > > > and contributors to fix the issues themselves. The sheriffs are
> > >>> > > > > not supposed to fix everything by themselves, but instead to keep
> > >>> > > > > the attention of other developers on the status of the builds.
> > >>> > > > > I suggest we have three sheriffs, who stay around for one month
> > >>> > > > > and then pass the token to someone else: one for drools and
> > >>> > > > > optaplanner, one for kogito, one for kie-tools.
> > >>> > > > >
> > >>> > > > > Let me know your ideas and feedback.
> > >>> > > > >
> > >>> > > > > Regards
> > >>> > > > >
> > >>> > > > > Paolo
> > >>> > > > >
> > >>> > > >
> > >>> > >
> > >>>
> > >>> ---------------------------------------------------------------------
> > >>> To unsubscribe, e-mail: [email protected]
> > >>> For additional commands, e-mail: [email protected]
> > >>>
> > >>>
