I really like Jens's line of thought. Rather than focusing on the negative side, try to figure out a way to make it work :). That's very inspiring.
I think the original slack proposal of Jens was rather brittle, but it made me think that we can actually implement what we need rather quickly. Following the `Dependabot` style @rebase - we could implement - very easily - there are ready actions for that and other projects like apache beam are using similar workflows - the comment workflow, where maintainer who wants to merge (or auto-merge) such "failing" PR would just comment it appropriately. What we really want is to make the "finalize" check succeed, so if the PR misses the "Finalize" step, a committer would just have to comment it with "@bypass-finalize` or `@finalize` or similar. And what it would do, it would use Github API to make the "Finalize" check succeed. There is an API for that. That has a few nice properties: * it uses the same GitHub UI we use, no need to have an "escape hatch" and separate remote * it follows the pattern we already agreed to some time ago and generally follow -> we are supposed (as maintainers) to explain when we are merging such failing PR to show that this is a deliberate action - I usually for example always add "fixed already in main" or "Fixed already by #..." comment - you really need a comment in this case explaining that it was not accidental "merge". * so if we have such workflow, it would "force" maintainer to make a deliberate decision "yes I want to merge it despite failing" and make a comment in the PR * it would only set the "finalize" check to succeed, so still you would not be able to merge it without having an approval - this would only bypass the "finalize" check WDYT? Would that address all concerns with auto-merge ? Does it make sense and is it "easy enough" to follow? Or are there any other concerns? On Wed, Apr 30, 2025 at 10:36 PM Pavankumar Gopidesu < gopidesupa...@gmail.com> wrote: > I am also in line with Jens. The auto-merge feature works well in keeping > the CI pipeline green before merging. > However, there are situations where we need to merge changes quickly. > Instead of using an > escape route (which I’m not a fan of, as it carries risks if something goes > wrong—even though we can revert, it’s still a concern), > we could explore ways to dynamically bypass certain checks when necessary > or maybe some help from INFRA? > > Pavan > > On Wed, Apr 30, 2025 at 9:22 PM Jens Scheffler <j_scheff...@gmx.de.invalid > > > wrote: > > > As there was a call for more opinions. Here I am :-D > > > > I understand both positions. As I like AutoMerge very much I am not > > giving up :-D I'd like to keep it. > > > > I think there is still an option in between. Maybe need to draft a bit > > of thoughts but I think we could build something still around the > > limitations allowing automation. > > > > Added a minor thought to the slack chat. But if community rules propose > > to push to devchennel let me know. > > > > Jens > > > > On 30.04.25 11:04, Jarek Potiuk wrote: > > > Yeah, And I would encourage everyone to try it and provide feedback > while > > > it is enabled. > > > > > > So far we identified a few things (and fix them) that made it > borderline > > > unusable (bug in workflow for UI-only changes, GitHub starting to > 429-us > > > with too many requests, and the mysterious "hanging" of the latest > > > botocore/celery (?) "special tests" -> all that is already addressed in > > > main, and those issues should not happen (hopefully), so I'd say we can > > now > > > "truly" see how it might work. > > > > > > And one comment from my side - indeed, I find it nice actually, but > it's > > > definitely not a "deal breaker"- so if others find it too disruptive, I > > am > > > definitely not going to die on this hill, I just thought it does > improve > > > the workflow in the way that it allows for mostly "fire and forget" > when > > > you approve the workflow, one thing that it definitely improves is that > > you > > > do not have to remember about coming back to merge a request when CI > > > succeeds. > > > > > > J. > > > > > > On Wed, Apr 30, 2025 at 10:58 AM Kaxil Naik <kaxiln...@gmail.com> > wrote: > > > > > >> Forgot to note an additional point in Summary: If we find anything > > blocking > > >> us in that period, we will merge > > >> https://github.com/apache/airflow/pull/50009 to disable auto-merge. > > >> > > >> On Wed, 30 Apr 2025 at 14:26, Kaxil Naik <kaxiln...@gmail.com> wrote: > > >> > > >>> Jarek & I discussed it on Slack in #internal-airflow-ci-cd. Summary > > >> below: > > >>> I have a PR to disable it: > > https://github.com/apache/airflow/pull/50009. > > >>> However, given that many countries will be on holiday on May 1 due to > > >>> Labour Day, and some teething issues were fixed yesterday, we will > let > > it > > >>> run for a few more days so other committers and contributors can get > a > > >>> chance to try it out and share their experience after the > > >> experiment/trial > > >>> is concluded. > > >>> > > >>> > > >>> > > >>> On Wed, 30 Apr 2025 at 13:59, Kaxil Naik <kaxiln...@gmail.com> > wrote: > > >>> > > >>>> Whoops yeah. > > >>>> > > >>>>> Yep. Because it did not have all conversations resolved. We also > have > > >>>> "require resolved conversation" as one of the branch protection > > >>>> conditions. > > >>>> I resolved the conversation and it got merged automatically. > > >>>> > > >>>> Let's adapt it when ready though, I don't see any urgency of getting > > >>>> enabling auto-merge or getting it contributed immediately to asf > INFRA > > >> when > > >>>> it isn't critical. It is about priortization > > >>>> > > >>>> I'd say - if that is really bothering us - let's shape the feature > > >> rather > > >>>>> than outright reject it. > > >>>> > > >>>> On Wed, 30 Apr 2025 at 12:37, Jarek Potiuk <ja...@potiuk.com> > wrote: > > >>>> > > >>>>> On Wed, Apr 30, 2025 at 7:55 AM Kaxil Naik <kaxiln...@gmail.com> > > >> wrote: > > >>>>>> To the point that the original PR is still not merged even after I > > >> had > > >>>>>> re-triggered the failed tests yesterday: > > >>>>>> https://github.com/apache/airflow/pull/49727 > > >>>>>> > > >>>>>> > > >>>>> Yep. Because it did not have all conversations resolved. We also > have > > >>>>> "require resolved conversation" as one of the branch protection > > >>>>> conditions. > > >>>>> I resolved the conversation and it got merged automatically. > > >>>>> > > >>>>> > > >>>>>> On Wed, 30 Apr 2025 at 11:20, Kaxil Naik <kaxiln...@gmail.com> > > >> wrote: > > >>>>>>> The gitbox escape hatch isn't it though -- if we are to allow > that > > >>>>> why > > >>>>>> not > > >>>>>>> just allow people to merge it directly from github that to go via > > >> an > > >>>>>>> "escape hatch". > > >>>>> Generally speaking GitHub has this option. Currently "admins" have > a > > >>>>> possibility of overriding branch protection (via UI). And it would > be > > >>>>> possible - if INFRA will allow it - to possibly add an .asf.yaml > > >> feature > > >>>>> to > > >>>>> also allow branch protection override for all committers or a > subset > > of > > >>>>> the > > >>>>> committers (PMC Members ? ). This is more of a limitation of the > > >> current > > >>>>> implementation of permissions than a missing feature. If we all > feel > > >> that > > >>>>> the gitbox escape hatch is not enough, we can likely even > contribute > > >>>>> such a > > >>>>> feature to .asf.yaml - if INFRA will be ok with the option. It's > very > > >>>>> easy > > >>>>> to contribute to - INFRA made it possible, we have a new framework: > > >>>>> https://github.com/apache/infrastructure-asfyaml - we can even > > >> implement > > >>>>> "airflow-only" .asf.yaml feature, that will not be initially > > available > > >> to > > >>>>> other ASF projects and later we can promote it to be available to > > >>>>> everyone. > > >>>>> > > >>>>> I'd say - if that is really bothering us - let's shape the feature > > >> rather > > >>>>> than outright reject it. > > >>>>> > > >>>>> > > >>>>>>> I am -1 on this auto-merge feature > > >>>>>>> > > >>>>> Understood :). But let's give it a bit more time as well and maybe > > >>>>> improve > > >>>>> it (see above) - unless we really feel we are blocked now - then it > > >>>>> should > > >>>>> be as easy as merging an .asf.yaml change to disable it. > > >>>>> > > >>>>> > > >>>>>>> On Wed, 30 Apr 2025 at 11:18, Kaxil Naik <kaxiln...@gmail.com> > > >>>>> wrote: > > >>>>>>>> That’s not a single person, it impacts the committers and the PR > > >>>>> author > > >>>>>>>> involved too. I don’t see how team productivity soars here. > > >>>>>>>> > > >>>>>>>> On Wed, 30 Apr 2025 at 02:39, Jarek Potiuk <ja...@potiuk.com> > > >>>>> wrote: > > >>>>>>>>> But yes, I also miss the previous "merge because I think it's > > >> safe" > > >>>>>>>>> workflow. > > >>>>>>>>> > > >>>>>>>>> I badly miss it. Personally, It hurts my productivity. > > >>>>>>>>> > > >>>>>>>>> But I think the "require status check" to be green is great for > > >>>>> "team > > >>>>>>>>> productivity". Usually when single person is impacted more than > > >>>>> team in > > >>>>>>>>> general, it's worse for the person impacted, but team > > >> productivity > > >>>>>> soars. > > >>>>>>>>> J. > > >>>>>>>>> > > >>>>>>>>> > > >>>>>>>>> On Tue, Apr 29, 2025 at 11:03 PM Jarek Potiuk < > ja...@potiuk.com> > > >>>>>> wrote: > > >>>>>>>>>> Just to add comment: > > >>>>>>>>>> > > >>>>>>>>>> a) there was some instability of "celery/boto" hanging tests > > >>>>> today > > >>>>>>>>> that is > > >>>>>>>>>> rather difficult to address - but we worked around it by just > > >>>>>> removing > > >>>>>>>>>> "special-tests" from pre-requisite of merging > > >>>>>>>>>> b) GitHub today (like literally today!) started to be picky on > > >>>>> "too > > >>>>>>>>> many > > >>>>>>>>>> requests" - I addressed it today for both helm chart and our > > >>>>> release > > >>>>>>>>> tests > > >>>>>>>>>> (we are using bearer-token to authenticate and increase the > > >>>>> limit - > > >>>>>>>>> and we > > >>>>>>>>>> added cache on downloading json schema that was downloaded > > >>>>> "per-test" > > >>>>>>>>>> c) in cases like the one mentioned above with intermittent > > >>>>> failures - > > >>>>>>>>>> simple "rerun failed jobs" (assuming it will succeed after > > >>>>> rerun) - > > >>>>>> is > > >>>>>>>>>> essentially equivalent of "merge" (unless it fails again which > > >>>>> for me > > >>>>>>>>> is a > > >>>>>>>>>> signal of "DO NOT MERGE") > > >>>>>>>>>> d) we always have the "gitbox" escape hatch - that allows any > > >>>>>>>>> committer to > > >>>>>>>>>> push the fix directly, bypassing the limits: > > >>>>>>>>>> > > >>>>>>>>>> This is a simple thing for committers: > > >>>>>>>>>> > > >>>>>>>>>> git add remote gitbox > > >>>>>> https://gitbox.apache.org/repos/asf/airflow.git > > >>>>>>>>>> git fetch gitbox > > >>>>>>>>>> git commit --amend ("add #PR number") > > >>>>>>>>>> git push gitbox BRANCH_NAME:main (you need to provide your > > >>>>> apache id > > >>>>>>>>> and > > >>>>>>>>>> password) > > >>>>>>>>>> > > >>>>>>>>>> This is a nice escape hatch that we can use as "exceptional > > >>>>>> workflow" - > > >>>>>>>>>> and it works - I did it quite a few times over the last few > > >>>>> days. Not > > >>>>>>>>> UI > > >>>>>>>>>> controlled, but IMHO exceptional workflow should be - well - > > >>>>>>>>> exceptional. > > >>>>>>>>>> J. > > >>>>>>>>>> > > >>>>>>>>>> > > >>>>>>>>>> On Tue, Apr 29, 2025 at 8:52 PM Kaxil Naik < > > >> kaxiln...@gmail.com> > > >>>>>>>>> wrote: > > >>>>>>>>>>> Similar experience as Elad, I am in favor of disabling it > tbh. > > >>>>> For > > >>>>>>>>>>> example, > > >>>>>>>>>>> https://github.com/apache/airflow/pull/49727 has a failing > > >>>>> test as > > >>>>>>>>> below > > >>>>>>>>>>> -- > > >>>>>>>>>>> which is not an issue, and test passes locally so I would > want > > >>>>> to > > >>>>>>>>> merge it > > >>>>>>>>>>> but I can't. > > >>>>>>>>>>> > > >>>>>>>>>>> FAILED > > >>>>>>>>>>> > > >>>>>>>>>>> > > >> > > > helm-tests/tests/helm_tests/airflow_aux/test_basic_helm_chart.py::TestBaseChartTest::test_priority_classes > > >>>>>>>>>>> - requests.exceptions.HTTPError: 429 Client Error: Too Many > > >>>>> Requests > > >>>>>>>>> for > > >>>>>>>>>>> url: > > >>>>>>>>>>> > > >>>>>>>>>>> > > >> > > > https://raw.githubusercontent.com/yannh/kubernetes-json-schema/master/v1.29.1-standalone-strict/priorityclass-scheduling-v1.json > > >>>>>>>>>>> On Tue, 29 Apr 2025 at 18:29, Jarek Potiuk <ja...@potiuk.com > > > > >>>>>> wrote: > > >>>>>>>>>>>> On Tue, Apr 29, 2025 at 1:46 PM Elad Kalif < > > >>>>> elad...@apache.org> > > >>>>>>>>> wrote: > > >>>>>>>>>>>>> Thanks for that Jarek! > > >>>>>>>>>>>>> I find the lack of ability to merge PRs fast very limiting > > >>>>> but > > >>>>>> it > > >>>>>>>>>>> might > > >>>>>>>>>>>> be > > >>>>>>>>>>>>> just something to get used to. > > >>>>>>>>>>>>> > > >>>>>>>>>>>> Indeed. I also see it, but also I got a few manually pushed > > >>>>> "must > > >>>>>>>>> fix > > >>>>>>>>>>>> quickly" to gitbox, and actually I find it really nice - > > >>>>> because > > >>>>>> it > > >>>>>>>>> is > > >>>>>>>>>>>> still possible, but it require some extra effort and > > >>>>> deliberate > > >>>>>> "ok > > >>>>>>>>> that > > >>>>>>>>>>>> one really should be pushed to unblock everyone" - as long > > >> as > > >>>>> we > > >>>>>> all > > >>>>>>>>>>>> (especially those people that are active in the > > >>>>>>>>>>>> #internal-airfow-ci-cd channel) know how to do it and can > > >> fix > > >>>>>> things > > >>>>>>>>>>>> quickly, this is actually a nice way to make it into > > >>>>> "exceptional" > > >>>>>>>>>>> workflow > > >>>>>>>>>>>> - which will push us more in making sure airflow main is > > >>>>> really > > >>>>>>>>> "green" > > >>>>>>>>>>> - > > >>>>>>>>>>>> which ultimately is our goal to make it as green as possible > > >>>>> all > > >>>>>> the > > >>>>>>>>>>> time. > > >>>>>>>>>>>> What **might help with that** (and also keeping the "enable > > >>>>> auto > > >>>>>>>>> merge" > > >>>>>>>>>>>> might motivate it more to) is to: > > >>>>>>>>>>>> > > >>>>>>>>>>>> * speed up the builds - we MUST prioritise now ARC (K8S > > >>>>>> self-hosted > > >>>>>>>>>>>> runners) to make our builds simply faster - I started a > > >>>>> discussion > > >>>>>>>>> and a > > >>>>>>>>>>>> small group of people who can work together to complete it > > >>>>> after > > >>>>>>>>>>> Hussein's > > >>>>>>>>>>>> POC) > > >>>>>>>>>>>> * speed up the image release - with ARM runners (which we > > >>>>> might be > > >>>>>>>>> able > > >>>>>>>>>>> to > > >>>>>>>>>>>> do independently as recently I think we have > > >>>>> hypervisor-enabled > > >>>>>> ARM > > >>>>>>>>>>> images > > >>>>>>>>>>>> available as public runners as github made it generally > > >>>>>> available). > > >>>>>>>>>>>> * speed up the doc builds for airflow-site - we (mainly > > >>>>> Pavan) are > > >>>>>>>>>>> close to > > >>>>>>>>>>>> complete offloading of the historical release docs to S3 > > >> and I > > >>>>>> hope > > >>>>>>>>> it > > >>>>>>>>>>> will > > >>>>>>>>>>>> cut down a lot on doc publishing workflows. > > >>>>>>>>>>>> > > >>>>>>>>>>>> J, > > >>>>>>>>>>>> > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org > > For additional commands, e-mail: dev-h...@airflow.apache.org > > > > >